# The numeral system of Proto-Niger-Congo

A step-by-step reconstruction

Konstantin Pozdniakov

Niger-Congo Comparative Studies 2

### Niger-Congo Comparative Studies

Chief Editor: Valentin Vydrin (INALCO – LLACAN, CNRS, Paris) Editors: Larry Hyman (University of California, Berkeley), Konstantin Pozdniakov (IUF – INALCO – LLACAN, CNRS, Paris), Guillaume Segerer (LLACAN, CNRS, Paris), John Watters (SIL International, Dallas, Texas).

In this series:



Konstantin Pozdniakov. 2018. *The numeral system of Proto-Niger-Congo*: *A step-by-step reconstruction* (Niger-Congo Comparative Studies 2). Berlin: Language Science Press.

This title can be downloaded at: http://langsci-press.org/catalog/book/191

© 2018, Konstantin Pozdniakov

Published under the Creative Commons Attribution 4.0 Licence (CC BY 4.0): http://creativecommons.org/licenses/by/4.0/

ISBN: 978-3-96110-098-9 (Digital), 978-3-96110-099-6 (Hardcover)

DOI: 10.5281/zenodo.1311704

Source code available from www.github.com/langsci/191

Collaborative reading: paperhive.org/documents/remote?type=langsci&id=191

Cover and concept of design: Ulrike Harbort

Typesetting: Sebastian Nordhoff

Proofreading: Ahmet Bilal Özdemir, Alena Witzlack-Makarevich, Amir Ghorbanpour, Aniefon Daniel, Brett Reynolds, Eitan Grossman, Ezekiel Bolaji, Jeroen van de Weijer, Jonathan Brindle, Jean Nitzke, Lynell Zogbo, Rosetta Berger, Valentin Vydrin

Fonts: Linux Libertine, Libertinus Math, Arimo, DejaVu Sans Mono

Typesetting software: XeLaTeX

Language Science Press, Unter den Linden 6, 10099 Berlin, Germany, langsci-press.org

Storage and cataloguing done by FU Berlin

Ирине Поздняковой





**Appendix A: Groupings of numerals by noun classes in 254 BC languages**


## **Acknowledgments**

Today the greatest benefit to being a researcher is the opportunity to directly contact leading specialists in the comparative studies of African languages. Even the best database does not ensure the proper interpretation of the results achieved by other scholars. In the course of my work on this monograph I have benefited from the help of many colleagues, whose comments and suggestions I greatly appreciate. My particular thanks go to Guillaume Segerer (Atlantic languages and RefLex database), Valentin Vydrin (Mande languages), Raymond Boyd (Adamawa languages), Larry Hyman (Bantu languages and Benue-Congo in general), Mark Van de Velde (Bantu languages), Marie-Paule Ferry (Tenda languages), Pascal Boyeldieu (Bua languages and Laal), Marion Cheucle (Bantu A.80), Denis Creissels (Balant), Sylvie Voisin-Nouguier (Buy), Ekaterina Golovko (Baga Fore), Odette Ambouroue (Orungu) and many others. It is a great pleasure for me to thank you all!

My special gratitude is addressed to colleagues who read the first version of the manuscript of the book and made a number of valuable critical remarks. These are members of the editorial board of this series of Niger-Congo Comparative Studies: Valentin Vydrin, Larry Hyman, John Watters and Guillaume Segerer. I tried as much as possible to take their remarks into account. Naturally, all responsibility for inevitable mistakes and shortcomings lies with me.

I should especially like to express my gratitude to Sebastian Nordhoff for the layout of this book. Many thanks also to my proofreaders: their comments were very useful to me.

## **Abbreviations**

### **Language groups and proto-languages**


### **Others**


## **1 Introduction**

### **1.1 Niger-Congo: the state of research and the prospects for reconstruction**

It is quite predictable that the title of this book may be met with skepticism by specialists in the comparative-historical study of African languages. The first question that may arise is whether a Niger-Congo (NC) reconstruction is achievable at all, considering that the reconstruction of the proto-languages underlying particular families and their branches has not been completed (or even properly started, as is the case for some groups and branches of NC). Before we turn to the structure of the book, let us try to answer this fundamental question. To do so, it seems reasonable to outline very briefly the present state of affairs in NC comparative studies.

First, it should be noted that presently there is no general scientific discipline such as "NC comparative studies". Instead, there are individual researchers who work on particular families, groups, sub-groups or branches of NC. Among these, comparative-historical Bantu studies has flourished the most. However, the Bantu languages comprise only a branch of the Southern Bantoid languages that (together with Northern Bantoid) go back to Proto-Bantoid. Hence Bantu is merely one of 16–17 Bantoid branches, as can be gleaned from the chart below (Table 1.1).<sup>1</sup>

The progress of comparative-historical studies of the Bantoid languages has been less impressive than that of Bantu studies. Proto-Bantoid, as well as a number of other proto-languages, goes back to Proto-Eastern-Benue-Congo. In turn, the latter (along with Proto-Western-Benue-Congo and possibly some other languages that do not belong to these two major groups of Benue-Congo) goes back to Proto-Benue-Congo (BC). Hence, the Bantoid branch is merely one of 14–15 branches of Benue-Congo, as demonstrated by the chart below (Table 1.2).

<sup>1</sup>This book does not investigate the genealogical classification of Niger-Congo as a whole, nor of the individual families of this macro-family. The schemes presented here take into account the most well-known classifications (sometimes with small deviations due to the specific purposes of our study). The scheme of Bantoid languages given here is based mainly on the classification in https://mpi-lingweb.shh.mpg.de/numeral/Niger-Congo-Benue-Congo.htm. It generally reproduces John Watters' classification (1989a: 401) with some deviations, which are not considered here.

The traditional reconstruction of Proto-BC based on regular correspondences between the proto-languages underlying the separate branches listed in Table 1.2 has developed rapidly in recent years. However (and I hope that my colleagues will take no offence at this statement), despite numerous brilliant studies dealing with the subject, this is still a relatively 'young' science.

Finally, in addition to Proto-BC there are probably more than ten proto-languages underlying other language families that together comprise the Niger-Congo macrofamily (see Table 1.3).

Most of the works presently available in NC comparative studies do not reach beyond this point. Exceptions are rare, and examples of the comparative-historical approach to NC reconstruction are few. Moreover, the most significant works of this kind (e.g. those of Westermann 1927, Greenberg 1966, Sebeok 1971, etc.) are not that recent and usually date to the middle of the 20th century. Comparative studies of the African macro-families got off to a quick start but had nearly come to a standstill by the end of the 20th century (important works such as Bendor-Samuel 1989, including Williamson 1988; 1989c, are few in this period).

Table 1.2: Benue-Congo languages

The inventory of Benue-Congo groups is given mainly according to Williamson 1989b: 266–269. The main differences in Table 1.2 are that Jukunoid is separated from Platoid, which allows us to better compare the numeral forms of these groups, and that Lufu has been added to the isolated languages. The division of BC into Western and Eastern branches does not always reflect the genealogical characteristics of the languages.

Table 1.3: Niger-Congo languages

The grouping of the 12 NC families into 5 geographical zones is merely a technical convenience for generalizing the data and implies nothing more. As for a genealogical tree of the NC languages, in my opinion there are as yet insufficient grounds for creating one.

So, what happened?

By the 1990s, our knowledge in the field of African languages had begun to grow exponentially. Hundreds of new language descriptions had been published, and the few dozen experts working in NC comparative linguistics were simply unable to digest this avalanche of new information.

The main problem in the 1960s was that we knew too little. From the 1980s on, we have faced the opposite problem: we know "too much". Not only do scholars not have enough time to absorb new results, sometimes they do not even have enough time to acquaint themselves with those results. During the last four decades, amidst this dialogue between linguistic knowledge and language data, African linguists have remained in listening mode. But I am convinced that the time has come for linguists to say something new again. Unlike even ten years ago, today we are well equipped to do so.

First, we have truly exceptional databases. The best one is the RefLex database developed by Guillaume Segerer (Segerer & Flavier 2011-2018). It contains more than one million words from African languages (as of 2017), and each entry contains a link to a PDF file of the corresponding source page. It provides a huge range of information and is maximally user-friendly for comparative linguists: it can be queried to establish regular phonetic correspondences, to support reconstruction, to rank reflexes and to carry out various kinds of statistical data analysis. The database is constantly being updated.

A big database is much more than just a huge amount of data. When a database reaches a certain degree of completeness with respect to the main families and branches of the NC macro-family, it opens up prospects for working both with the distribution of words that do exist and with the distribution of **gaps** in postulated cognates. The distribution of filled cells and lacunae is a powerful tool allowing 1) identification of important innovations, 2) targeted searches for unusual phonetic reflexes, 3) detection of diachronic semantic changes and 4) refinement of genealogical classification.

In my opinion, the opportunity to rely both on the apparent cognates and on the missing reflexes of reconstructed prototypes in particular languages dramatically changes the approach to reconstruction itself.

The following case may serve as an illustration of this statement. Suppose we need to assess one of Greenberg's proposals, e.g. a Niger-Congo root meaning 'hill'. Among the reflexes quoted by Greenberg for this root are: "(2) Busa *kpi* 'mountain', Kweni *kpi*; (4) Gã *kpɔ*; Gwa *ogba* 'mountain'; (5) Nungu *agbɔ*, Ninzam (Ninzo) *igbu*. Kordofanian: (2) Tagoi *(c)ibe*." (Greenberg 1966: 155). The phonetic correspondences underlying the comparison of these forms will not be discussed here (we will just assume that they are valid), for the main problem lies elsewhere. A reader with no access to a representative lexical database of the NC languages is always uncertain about a number of key issues, including:


The RefLex database establishes that:



prototype. One of these roots is presented in the chart below (Table 1.4) (one could mention some other roots nearby):


Table 1.4: *\*tʊnd* 'hill, mountain' in Niger-Congo

The exact correspondence between Proto-Bantu (*\*tʊ̀ndà*, zones HJKPMNRS > (?) *\*dʊ́ndʊ̀*, zones EGHJKLMNRS), Ijo (Ibani *tʊ́ndʊ́*) and Atlantic languages (Atlantic Bak: Manjak *ntʊnda*, Atlantic North: Basari *e-tə́nd*, Bapen *ɛ-tʌnd*, Laala *tundə*, Fula *tulde*, Wolof *tund*) is reason enough to postulate the root *\*tʊnd* 'hill, mountain' at the Proto-NC level, especially since these languages have apparently been out of direct contact.<sup>2</sup> In addition, the absence of this root in Gur-Ubangi-Adamawa may prove to be a shared innovation in these languages.
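The interplay of filled cells and gaps described here can be pictured as a simple presence/absence table. The following Python sketch is purely illustrative: the forms are those cited above, the gaps for Gur, Ubangi and Adamawa follow the statement in the text, and the layout and the helper name `gap_report` are assumptions of this sketch, not features of RefLex.

```python
# Illustrative sketch: reflexes of *tʊnd 'hill, mountain' by language group.
# Forms are the ones cited in the text; None marks a gap (no reflex reported).
REFLEXES = {
    "Bantu": ["*tʊ̀ndà"],
    "Ijo": ["tʊ́ndʊ́ (Ibani)"],
    "Atlantic": ["ntʊnda (Manjak)", "e-tə́nd (Basari)", "tund (Wolof)"],
    "Gur": None,
    "Ubangi": None,
    "Adamawa": None,
}

def gap_report(reflexes):
    """Split groups into those with a cited reflex and those with a gap."""
    attested = sorted(g for g, forms in reflexes.items() if forms)
    gaps = sorted(g for g, forms in reflexes.items() if not forms)
    return attested, gaps

attested, gaps = gap_report(REFLEXES)
print("attested:", attested)
print("gaps:", gaps)
```

A cluster of gaps in neighbouring groups, as here, is only a hint that deserves a targeted search; whether it reflects a shared innovation has to be decided by specialists in those groups.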

Using the databases, the focus of our research could be redirected toward the basic meaning of the lexemes (rather than toward occasional phonetic similarities between the forms). This approach may help in answering the following question: if a Proto-NC term for 'mountain, hill' existed, what did it sound like? The answer would probably be as follows: this word could have sounded like *\*tʊnd*, *\*kong/keng* or *\*kudu* ('hill, rock, stone'), but not like *dima* (PB *\*dɩ̀mà*, zone EGJ), *mut* (Proto-Jukunoid *\*muT*) or *pi* (PB *\*pɩ̀dɩ̀*, zone KLMN).

Upon arriving at these unconventional "results", one could bring them to the attention of specialists in particular NC languages and branches for further evaluation. Without such professional evaluation there can be no hope for success. Moreover, in recent years it has become evident that this evaluation needs to be collaborative (i.e. made by dozens of specialists working together) for the simple reason that today no specialist can be proficient in the languages of more than one or at most two NC families. Hence, it is important that these specialists are asked questions they can answer, so ideally the approach outlined above should be applied to every family within Niger-Congo. For example, according to the etymological database of the Atlantic languages (Pozdniakov & Segerer 2017; 3,700 cognates), only *\*tʊnd* and *\*thəng* are potentially interpretable as the terms for 'hill, mountain' in Proto-Atlantic.

<sup>2</sup>To repeat, there are some other candidates for 'mountain' in NC, which we do not treat here.

Initially I thought of numerals as an ideal group of terms to test this approach. On the one hand, the core group of numerals must have existed in Niger-Congo. On the other hand, they represent a relatively compact lexical-semantic group with minimal potential for semantic shifts. My initial question seemed simple: what is the most probable Proto-Niger-Congo root for 'two'? The term for 'two' (being the only numeral on the Swadesh list) is generally recognized as one of the most persistent numerals. Why not try reconstructing it on the basis of the NC evidence? It turned out, however, that such a reconstruction is beset with difficulties, so what was originally intended as an article turned into this very book. The structure of the book is described in the section below. As I hope to demonstrate, this structure is conditioned by specific issues encountered in the course of the reconstruction of NC numerals.

### **1.2 Sources and the monograph structure**

### **1.2.1 Sources**

Numeral terms included in the majority of lexical sources hold a privileged position. The information pertaining to the Niger-Congo numerals is more than extensive; it is nearly exhaustive. In addition to the above-mentioned RefLex database by Segerer & Flavier, which contains over 17,000 entries marked as "numeral" (as of April 2017), a number of other databases with extensive coverage of the Niger-Congo languages are available. One of them is the "Numeral Systems of the World's Languages" database created by Eugene S. L. Chan and edited by Bernard Comrie (Chan). It incorporates data on the number systems of about 4,300 languages (with hundreds of Niger-Congo languages among them). Two or even three sources (often unique) are accessible for some of the languages via this neatly organized and user-friendly database. Another universal database that provides numerical data is "Numerals 1 to 10 in over 5000 languages" by Rosenfelder. It was consulted to a somewhat lesser extent because it only includes evidence pertaining to the first ten numerals, for which a simplified transcription is used. Finally, a number of unpublished databases that incorporate the evidence of specific Niger-Congo families and groups were consulted, e.g. the etymological databases of Atlantic (Pozdniakov & Segerer 2017) and Mande (Valentin Vydrin).


As a result, a total of 2,200 sources for Niger-Congo languages were used in this study. This raises the issue of references, since it is impossible to provide a complete list of sources for every NC language. The language index at the end of this book lists the nearly 1,000 languages cited. For these 1,000 languages, the main sources I used are indicated in Appendix E. The index of sources in Appendix E is structured according to the NC main families in alphabetical order.

For each language, I provide not only the source(s) that can be found in the bibliography, but also the name of every contributor in Chan's database [Chan]. The list of contributors is many pages long, but their names should be known, even if their data are unpublished. This is the least I can do to express my sincere gratitude to each of them.

### **1.2.2 Monograph structure**

Noun class affixes are present in numerical terms in the majority of the Niger-Congo languages. Many forms that are considered primary at the synchronic level have frozen noun class affixes that are no longer productive. In such cases it is extremely difficult to distinguish the etymological root within a numerical term. Without it, however, both the comparison and reconstruction of roots is impossible. This is why the second chapter of this book is devoted to the study of various uses of noun class markers in numeral terms.

The third chapter deals with alignment by analogy in numeral systems. As in other languages, numerals represent a lexical-semantic group that is especially subject to alignment by analogy due to its closed structure, where words are associated in a paradigm. A textbook example is the term for 'nine', with Indo-European \***n**- irregularly reflected in Proto-Balto-Slavic as **d**- (Russian *dev'at'* '9' instead of the expected *\*nev'at'*) by analogy with the term for 'ten' (Russian *des'at'* '10'). This yielded a minimal pair *dev'at' ~ des'at'* that forms a "class of the upper numerals" within the first ten. Adjacent numerals in the NC languages may be aligned with each other by a similar formal marker. Thus, no satisfactory etymology can be suggested for the forms attested in Mumuye (Adamawa; *ziti* '2' ~ *taːti* '3' ~ *dɛ̃̀ːtì* '4') without an analysis of alignment by analogy. The issues pertaining to both detection and analysis of such alignments are addressed in Chapter 3.
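A first, very rough way of spotting a possible alignment of this kind is to look for a formal element shared by adjacent numerals. The sketch below does this for the Mumuye forms cited above; it is an illustration under simplifying assumptions (tone, nasalization and length are ignored, and the function names are hypothetical), not the procedure applied in Chapter 3.

```python
import unicodedata

def strip_marks(form):
    """Drop combining marks (tone, nasalization) and the length sign."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed
                   if not unicodedata.combining(ch) and ch != "ː")

def common_suffix(forms):
    """Longest string of segments shared at the end of all forms."""
    stripped = [strip_marks(f) for f in forms]
    suffix = ""
    for chars in zip(*(reversed(s) for s in stripped)):
        if len(set(chars)) != 1:
            break
        suffix = chars[0] + suffix
    return suffix

# Mumuye (Adamawa) numerals as cited in the text.
mumuye = {2: "ziti", 3: "taːti", 4: "dɛ̃̀ːtì"}
print(common_suffix(mumuye.values()))
```

A shared ending is only a signal that the forms need closer inspection: it may equally reflect an inherited suffix or a class marker, which is why the analysis cannot be reduced to string comparison.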

Chapter 4 offers a step-by-step reconstruction of the number systems of the proto-languages underlying each of the twelve major NC families, on the basis of the step-by-step reconstruction of numerals within each family. The term "reconstruction" as applied to numerals throughout this book calls for a definition. As mentioned above, the use of this term has been questioned, mainly because systems of regular phonetic correspondences between the languages within NC families remain unknown. This is why Kay Williamson opted for the term *pseudo-reconstructions* (marked with # instead of \*): "Reconstructions proposed by their authors as based on regular sound correspondences are preceded by an asterisk. Pseudo-reconstructions based on a quick inspection of a cognate set without working out sound correspondences are preceded by a #" (Williamson 1989b: 251). In his numerous online publications Roger Blench uses # as well, but his terminology is different: he prefers the more neutral term *quasi-reconstructions*. Modern comparative study of the NC languages is a relatively young science, so the opposition between "real" and "pseudo-/quasi-" reconstructions seems irrelevant to me at this stage, all the more so because nearly all of our reconstructions (maybe with the exception of Bantu and some other branches) should be marked with #, including the large proportion of reconstructions allegedly based on the evidence of historical phonetics. On the other hand, I think that many colleagues would agree with the following statement: although we do not know the regular phonetic correspondences between the languages that belong to different NC families, there is hardly any doubt that the NC root for 'three' sounded something like *tat*.

Throughout this book the term "step-by-step reconstruction of number systems" (e.g. in the Atlantic family) is used in reference to the method that includes the following steps:


Chapter 5 deals with the reconstruction of the Proto-Niger-Congo numeral system on the basis of the step-by-step reconstructions offered in Chapter 4 for each of the twelve major families and a handful of isolates. The reconstruction described in Chapter 5 inspired the analysis of the distribution of reflexes of the NC proto-forms within each of the twelve families (as well as within the isolates) in order to establish:


The results of this analysis are presented in Chapter 6.

To illustrate the logic of the complex structure of the monograph, let us consider one example.

In Chapter 4, along with other NC families, the numerals of the Atlantic languages are analyzed (§4.12). Atlantic languages are divided into two main groups – North Atlantic (§4.12.1) and Bak Atlantic (§4.12.2).

In Sections §4.12.1.1–§4.12.1.7, the numeral systems of the seven main subgroups of the North Atlantic languages are considered in turn. In particular, §4.12.1.3 considers numerals in the Jaad-Biafada subgroup and establishes that the form *\*-po* is attested for the numeral '10' in these languages. In the final section of §4.12.1, namely §4.12.1.8, the forms of numerals in the seven northern subgroups are compared, and it is concluded in particular that the most probable reconstruction of the numeral '10' for Proto-Northern Atlantic is *\*pok*.

In Sections §4.12.2.1–§4.12.2.4, the numeral systems in each of the four subgroups of the second Atlantic group, namely Bak, are discussed in turn. The final section concerning the Bak group (§4.12.2.5) concludes that the only candidate for reconstructing '10' in Proto-Bak (in addition to the possible model 10 = 5 \* 2) is the root \*-*taaj*.

In the final paragraph of §4.12, namely in §4.12.3, the systems of the North Atlantic languages and the Bak Atlantic languages are compared. This paragraph concludes that the comparative evidence points to the total absence of common roots present in both groups. The only exception to this is the root *\*tɔk / \*tVk* 'five'. Accordingly, it is concluded that it is impossible to reconstruct the Proto-Atlantic root for the numeral '10' without the Niger-Congo context.

In Chapter 5, reconstructions for each family are compared. Accordingly, Chapter 5 has a different structure. Whereas in Chapter 4 each section is devoted to a particular family of languages (for instance, §4.12 is devoted to the Atlantic languages), in Chapter 5 each section is devoted to the prospects for the reconstruction of one Niger-Congo numeral. So, in §5.10 all intermediate reconstructions for the numeral '10' are considered. It turns out, in particular, that the form *\*-taaj* reconstructed for '10' in Proto-Bak does not find parallels in other Niger-Congo branches. In contrast, the root *\*pok* '10', reconstructed for the North Atlantic languages, can be related to the roots reconstructed for the vast majority of Niger-Congo families (it seems to be missing only in Ijo, Dogon and Kordofanian). Based on the NC comparison, the root for '10' is reconstructed as *\*pu / \*fu.*

Chapter 6 traces the history of the numerals of Niger-Congo, reconstructed in Chapter 5, in each individual family of languages. Accordingly, each section, as in Chapter 4, is devoted to one of the NC families. So, §6.12 is devoted to the Atlantic languages. In particular, it is concluded that in the North Atlantic languages the term for '10' has been preserved in three subgroups (Wolof \**fukk*, Proto-Tenda \**pəxw*, Proto-Jaad-Biafada \**po*). In the other subgroups it has been replaced by isolated innovations. The forms of the Bak languages are likewise innovations.

So, the basic logic of the chosen structure of the book is as follows: we will consistently move from reconstructions in individual families (Chapter 4) to the reconstruction of each Niger-Congo numeral (Chapter 5) and to the interpretation of each individual family in the Niger-Congo context (Chapter 6). We will take into account the provisions formulated in the preliminary chapters concerning noun classes in numerals (Chapter 2) and changes by analogy in systems of numerals (Chapter 3).
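The three passes just described (bottom-up within families, across families, and back down to each family) can be summarized for the numeral '10' in Atlantic with a small sketch. The forms are those quoted above; the nested layout, the variable names and the retention flags are assumptions made only for this illustration.

```python
# Intermediate reconstructions for '10' in Atlantic, as reported in the text.
# The nested layout is an assumption made for illustration.
TEN = {
    "North Atlantic": {
        "proto": "*pok",
        "subgroups": {"Wolof": "*fukk", "Tenda": "*pəxw", "Jaad-Biafada": "*po"},
    },
    "Bak": {"proto": "*-taaj", "subgroups": {}},
}

# Chapter 5 result for the macro-family, and the Chapter 6 verdict per group.
NC_TEN = "*pu / *fu"
RETAINS_NC_ROOT = {"North Atlantic": True, "Bak": False}

def summarize(groups, retained):
    """One line per group: its proto-form and its status relative to NC."""
    lines = []
    for name, data in groups.items():
        status = "retention" if retained[name] else "innovation"
        lines.append(f"{name}: {data['proto']} ({status} relative to NC {NC_TEN})")
    return lines

for line in summarize(TEN, RETAINS_NC_ROOT):
    print(line)
```

The point of the sketch is only the direction of travel: the group-level forms are established first, the NC form is proposed from their comparison, and the retention or innovation label is assigned last.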

## **2 Noun classes in the Niger-Congo numeral systems**

In most NC languages, the numeral stems are combined with noun class markers. More often we are dealing with the dependent markers of noun classes (in particular, in the numeral '1', as well as in the numerals '2'-'5') in those languages where there is agreement between numerals and nouns. But class markers appear in many languages even without any agreement. For example, when counting, numerals are often used in a nominal function and include obligatory markers of noun classes. In this case, numerals used as nouns and numerals proper can have different class markers (and different roots). Thus, in Likile (Bantu C) we find *li-yɔɔ* 'ten' (cl5) and *mo-túkú / mi-* 'dozen' (cl3 / cl4) (Carrington 1977).

In many languages, noun classes in numerals are easily recognized. In other languages, as a result of phonetic processes at the junction of the class marker (CM) and the numeral stem and/or as a result of changes by analogy in the paradigm of numerals, it may be difficult to determine which noun class is included in the numeral, although we can distinguish a lexical root. Thus, in Lulamoji (Bantu J) an obscure CM **mm-** is observed in some derived numerals (*mm-kágá* '60' < *mu-káagá* '6' and *mm-sáánvu* '70' < *musáánvu* '7') (Larry Hyman, p.c.). It is not homorganic, so we cannot treat it as cl10. Meanwhile, in the majority of other languages within this group, it is clearly cl10 that is observed in these forms: cf., for example, Gwere *ṅkɑːgɑ* '60', *ṅsɑnvú* '70', and *lù-kúmì* '1000' / *ṅkúmì, βìβírì* '2000' (clearly cl11 / cl10).<sup>1</sup> Such cases do not pose serious problems for reconstruction.

However, in a number of languages we do not have sufficient synchronic criteria to decide whether we are dealing with the root of a numeral or with a combination of a root with an archaic noun class marker. In other words, we cannot isolate the root, and therefore we cannot compare it with the roots of other languages. E.g. we possess no formal proof that the Kobiana (Atlantic) term *sana* 'four' is composed of a class prefix **sa**- added to the lexical base (**na**). This base is only distinguishable by means of external comparison, although this method alone is admittedly insufficient, since the Kobiana term may just as well be interpreted as an innovation (*sana* '4').

<sup>1</sup>The irregular allomorph of cl.10 may have arisen as a result of a change by analogy with the basic numerals '6' and '7': homorganic **N** (cl.10) in these derived forms > **mm-** by analogy with **mu-** (cl.3).

In more complicated cases, it should be assumed that a noun class affix replaced one of the segments of the stem, thus becoming an integral part of it. The Wolof (Atlantic) numerals provide a good example of this phenomenon. The following numerical terms are attested in Wolof at the synchronic level: *ñaar* '2', *ñett* '3', *ñeent* '4'. Normally the noun class affixes are not included in the lexical base in Wolof, so synchronically we do not have to interpret the first consonant of Wolof numerals as a prefix. However, there are a number of important arguments in favor of the presence of the frozen prefix \***Ñ-** in the Wolof numerals. First, these are the only numerals that agree in the **Ñ** class, one of the two plural noun classes preserved in Wolof (cf. *fukk* 'ten', which agrees in the singular noun class B). Secondly, the forms *yaar* '2' and *yett* '3', which agree in the **Y** class (their initial consonant being identical to the marker of this other plural noun class), have been preserved in some Wolof dialects. Finally, as we hope to demonstrate below, the unification of numerals by class in Niger-Congo languages is characteristic of terms covering the sequence from 'two' to 'four'. Thus, in the diachronic perspective, the consonants in question should be viewed as class markers rather than stem segments. However, if this assumption is correct, we are forced to conclude that these markers have been integrated into the stem, having replaced the original initial consonants of the terms in question, all the more so because VC-roots are uncommon in Wolof (numerical roots most probably had CVC structure, see Pozdniakov & Robert 2015: 615–616). This means that the Wolof terms are of little significance for the reconstruction of the terms for '2–4' in Proto-Atlantic.

Most of the issues (theoretical ones included) that have complicated our reconstruction while studying noun classes in the families and branches of Niger-Congo pertain to the relationship of noun classes and numerals at the synchronic level. These problems are often left aside in the grammatical descriptions and do not attract sufficient attention from linguists. I am not aware of any work which discusses them systematically. Meanwhile, I am sure this question is worthy of attentive study because it reveals additional characteristics of noun class systems.

The first five numerals in Niger-Congo usually agree with nouns, for example in Sereer: *o-koor o-leng* 'one man', *a-koy a-leng* 'one monkey', *Ø-naak Ø-leng* 'one cow'. In some languages and branches of the macro-family, the inventory of numerals that show agreement is reduced.

As noted, the noun class marker may appear in numerals in some contexts which are not related to the agreement.

1. For instance, in counting, the majority of languages include a class marker (CM) in the numeral; moreover, different numerals may have different affixes. For example, in Biafada the class **N** is used for the numerals '1' and '6–7', **bi-** for '2–4', **ɡə** for '5', **Ø** for '8–9' and **ba** for '10'.

Many languages use a CM in numerals from '6' upward, that is, in numerals that do not show class agreement, and not only in counting. For example, in Manjak: *ngə-bʊs ngə̀-təb* 'two dogs' (agreement), *ngə-bʊs ʊ̀-ntaja* 'ten dogs' (lack of agreement; the numeral '10' with the class marker **ʊ̀-** is used in an independent form).

The choice of the noun class for numerals in the two aforementioned contexts (in counting forms, and in numerals with no agreement) represents a very interesting case, which I will outline below.
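The Biafada counting forms mentioned above show how class markers partition the numeral sequence into groups. A minimal sketch of this grouping follows; the marker assignments are taken from the text, while the dictionary layout and the function name `group_by_class` are hypothetical.

```python
from collections import defaultdict

# Class marker used with each Biafada numeral in counting, as cited in the text.
BIAFADA_COUNTING_CLASS = {
    1: "N", 2: "bi-", 3: "bi-", 4: "bi-", 5: "ɡə",
    6: "N", 7: "N", 8: "Ø", 9: "Ø", 10: "ba",
}

def group_by_class(class_of):
    """Invert the numeral -> marker mapping into marker -> list of numerals."""
    groups = defaultdict(list)
    for numeral in sorted(class_of):
        groups[class_of[numeral]].append(numeral)
    return dict(groups)

for marker, numerals in group_by_class(BIAFADA_COUNTING_CLASS).items():
    print(marker, numerals)
```

Seen this way, the numerals '2–4' form one block, which is the kind of grouping by class that the rest of this chapter examines.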

2. The interaction between noun classes and numerals cannot be limited to the aforementioned contexts. Noun classes emerge as well in derived numerals. The three main cases will be highlighted as follows.

Firstly, in the majority of Niger-Congo languages (and, apparently, even in Proto-Niger-Congo) the numeral '8' was formed from '4' by the reduplication of the first syllable of the original root \*CL-*na(h)i* '4' > \*CL-*na-na(h)i* '8'. Often the noun class marker of '4' and '8' coincides, but sometimes they do not. A question therefore arises: which factors define the choice of a noun class in a derived numeral?

Secondly, the Niger-Congo languages use compound numerals extensively, as do the majority of languages in the world. For example, the numeral '40' is formed following the model '40' = '4\*10' (in many Bantu languages, for instance) or '40' = '20\*2' (in the majority of Atlantic languages). The latter model is based on finger-counting, where two hands and two feet give a sum of 20. The numeral '20' goes back to the lexeme 'chief' or 'man'. In these languages the numeral '15' is often formed following the model 'two hands and one foot'. This model is well known and is discussed in the literature. However, the question of the choice of noun class in the first and second formatives of these compound numerals has often been left aside, although it needs clarification. The following questions will be discussed in the present study.

#### 2 Noun classes in the Niger-Congo numeral systems

In a compound numeral, for example, '20' = '10\*2', the class marker is often absent in the second formative. For example, in Bomwali (Bantu, A80) we have: *Ø-kamɔ* '10' (cl9),<sup>2</sup> *ɓe-ɓa* '2' (cl2), *mɔ-kamɔ Ø-ɓa* '20'. In this type of language, we have additional causes to discuss derivative words rather than syntagms.

In a compound numeral, both formatives include class markers, for example, '20' = 'CL-10\*CL-2'. The CM can be different or the same in the two formatives: Pinji (B30) *n-dzìmà dí-bàlè* '20' (10\*2), Nsong (B80) *ma-kwǐm m-ɔːl* '20' (10\*2). In the latter case, a particular type of *agreement* can be observed, that is, the second formative agrees in class with the first formative.

If in a compound numeral both formatives include class markers, as in '20' = 'CL-10\*CL-2' then theoretically we can expect that the noun class of the first formative will coincide with the class of the independent numeral '10'. This strategy is very rare. One of the unique examples comes from Moghamo (Grassfields) *ì-ɣùm-bē* '20' (*ì-ɣùm* '10', *í-bē* '2'). In the majority of cases the noun classes of the two formatives do not coincide. For instance, in the same branch of Benue-Congo (Grassfields): Laimbue *mɨ̀ ɣɨm-bò ́* '20' (*ɨ-ɣɨḿ* '10', *bò* '2'), the number '10' changes its class, being part of the first formative of the numeral '20'. The interpretation of this strategy in Niger-Congo languages will be given later. The same problem arises with the second formative. Very rarely does its class coincide with the noun class of the initial numeral (in the present case we deal with the numeral '2'). In the majority of cases it differs. The cause is, as it was already mentioned above, that the second formative agrees with the first one. For example, in the same group of languages (Grassfields): Mundani *è-ɣɛm ye-be* '20' (*è-ɣɛm* '10', *be-be* '2'). In some languages, noun classes of simple and compound forms differ even if agreement is absent.

3. Finally, the strategy of forming numerals only by the change of the noun class and with no changes in the lexical root represents a real parade of paradigmatic values of noun classes in numerals. This strategy was systematically developed in one zone of Bantu languages, that is zone J (although it can be encountered sporadically in some other Niger-Congo languages). For example, in Chiga (Bantu J): *ì-βìɾí* '2' > *ɑ̀ː-βìɾí* '20'; *mù-kɑ̂ːɡɑ̀* '6' > *ŋ-kɑ̂ːɡɑ̀* '60', *mù-nɑ̂ːnɑ̀* '8' > *kì-nɑ̂ːnɑ̀* '80'.

<sup>2</sup> For a reader who is not aware of the tradition of Bantu linguistics, it is necessary to explain that in Bantu languages there is a stable inventory of noun classes, each having a fixed number. The ongoing numeration of Bantu was found useful for the study of noun classes in Niger-Congo in general, where the numeration of classes of non-Bantu languages represents a concrete etymological hypothesis. If a scholar assigns the number '6' to the class -**ɗam** of Fula (Atlantic language), it means that etymologically it should be related to the class **\*ma** (CL 6N) of Proto-Bantu.

It is interesting that the same language combines all three strategies. Thus, in Chiga:



### **2.1 Noun classes in the counting forms of numerals**

In some Niger-Congo languages, numerals do not have noun class markers in the counting form, but the number of these languages is very low. In the Atlantic family the only language with this feature is Balant. In the majority of Niger-Congo languages while naming a numeral (for example, in counting) noun class markers are used. These markers may be the same for all numerals, but this is a rare case. More often, for the numerals 1–10 there are three to four different markers (furthermore, special class markers may be used for the numerals '20', '100', '200' and others).

A fragment of the Tetela (C80) numeral system is presented below (Table 2.1).

Table 2.1: Tetela numerals

We see here a variety of classes as well as plenty of mini-clusters (note the noun class switch that occurs when a number becomes a part of a compound term; this phenomenon is characteristic of the Niger-Congo languages). The terms for 'one' (**ó**- class), 'hundred' (**lo**-) and 'thousand' (**ki**-) appear to be isolated on account of their noun class. At the same time, the following groups of terms are distinguishable: '2–3' (**ha**-), '4–6/20' (**a**-, «/» refers to the grouping of nonadjacent numerals), '7–8' (**e**-), and '9–10' (**di**-). It should be noted, however, that even in such systems some numerals can be used without noun class markers ('2000').
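The grouping notation used here can be made explicit with a short sketch. It takes the Tetela class markers exactly as listed in the paragraph above and renders each group in the "4–6/20" style, where adjacent numerals are joined by a dash and nonadjacent ones by a slash. The function names `group_by_class` and `label` are illustrative assumptions, not part of the original analysis.

```python
from collections import defaultdict

# Class markers of the Tetela counting forms, as listed in the text.
TETELA_CLASS = {
    1: "ó", 2: "ha", 3: "ha", 4: "a", 5: "a", 6: "a",
    7: "e", 8: "e", 9: "di", 10: "di", 20: "a", 100: "lo", 1000: "ki",
}

def group_by_class(class_of):
    """Map each class marker to the sorted list of numerals that carry it."""
    groups = defaultdict(list)
    for numeral, marker in class_of.items():
        groups[marker].append(numeral)
    return {marker: sorted(nums) for marker, nums in groups.items()}

def label(numerals):
    """Render a sorted group: runs of adjacent numerals as 'a–b', gaps as '/'."""
    runs = []
    for n in numerals:
        if runs and n == runs[-1][-1] + 1:
            runs[-1].append(n)   # extends the current run of adjacent numerals
        else:
            runs.append([n])     # starts a new run
    return "/".join(
        str(r[0]) if len(r) == 1 else f"{r[0]}–{r[-1]}" for r in runs
    )

groups = group_by_class(TETELA_CLASS)
for marker, nums in groups.items():
    print(f"{marker}-: {label(nums)}")
```

A group consisting of a single numeral (here **ó**-, **lo**-, **ki**-) corresponds to what the text calls an isolated term.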

Three issues need to be mentioned here.

The noun class markers are easily distinguishable in Tetela. However, for the majority of the NC languages (especially the non-Bantu ones) this is not the case. The criteria that would allow for distinguishing between the markers and the segments of stems are often lacking, which means that we have no idea which stem in a language under study is to be used for comparative purposes. The situation is even more grave in those numerous cases where an additional class marker is added to a numeral which contains an archaic class marker integrated in a stem.

The mechanism underlying the grouping of numerals into the mini-clusters (by including them in a common noun class) remains virtually unexplored, although it is certainly worthy of investigation and thorough consideration from the theoretical point of view. What was the motivation behind the use of the class marker **ha**- with the Tetela terms for 'two' and 'three', while in case of 'nine' and 'ten' the class marker **di**- was preferred in this language? The answer to this question is probably not to be sought within the semantics of a given noun class. On closer examination, the choice of a noun class in such distributions is often unmotivated by anything other than the need to formally distinguish a group of numerals (as opposed to other groups). In this respect, this mechanism is very similar to the alignment by analogy as applied to numerals in many languages. This strategy (implying an irregular alteration of a part of a lexical stem) can be compared to a radical surgery, which is never an easy option. Languages with noun classes have less traumatic means to achieve the same result, e.g. by using different noun class markers to distinguish between the groups of numerals. This elaborate marking technique is widely attested in the Niger-Congo languages. The grouping of numerals is typologically interesting as well: some of the groups are fairly common whereas some are quite rare. Moreover, it is probable that these groups were formed independently in different languages: a situation where a pair of closely related languages exhibit radically different grouping and vice versa is not uncommon.

Some numerals are not normally subject to grouping and tend to be marked with a specific noun class, thus standing in opposition to the rest of the numerical terms. The use of this specific class is especially frequent with the terms for 'one', 'hundred' and 'thousand', cf. e.g. specific noun classes observable in the Tetela terms for 'one' (**ó**-) and 'hundred' (**lo**-).

Let's look at the distribution of numerals in noun classes for the languages where this information is available. This observation will be made on a selection of 254 Benue-Congo languages (among these, 166 are Bantu languages, evenly distributed by zones). Our sampling comprises languages that are known to employ noun classes on the numerical terms used in counting.

### **2.1.1 The specific marking of numerals**

As mentioned above, specific noun classes are used with the terms for 'one' and 'ten' especially often: 174 languages out of 254 mark the numeral '1' in a distinguished way, and 151 languages mark the numeral '10' separately.

Examples of systems with the term for 'one' being in opposition to the rest of the numerals (marked with a different noun class)<sup>3</sup> are provided below (Table 2.2).

Examples of another strategy (the term for 'ten', being a noun, is opposed to the rest of the numerals by means of a noun class) are given in Table 2.3.

Another strategy with the terms covering the sequence from 'two' to 'nine' being opposed to the terms for 'one' and 'ten' is characteristic of the languages represented in Table 2.4.

However, the terms for 'one' and 'ten' can form a group opposed (by means of a noun class) to the rest of the numerals (Table 2.5).

With the exception of the terms for 'one' and 'ten', a specific marking of numerals by means of a noun class is rarely attested. A specific noun class (different from noun classes in other numerals) was found in only 6 languages for the numeral '3', and in only 7 for the numeral '4'. It should be noted, however, that a specific marker is often employed for the terms within the sequence from 'six' to 'nine', e.g. the term for 'nine' bears a specific noun class marker in the 151 languages under study.

<sup>3</sup>Considering the fact that numerals '2–9' belong to the same noun class, the numerals '6–9' are not included in Tables 2.2–2.5.

| Branch | Language | '1' | '2' | '3' | '4' | '5' | '10' |
|---|---|---|---|---|---|---|---|
| J30 | Nyole | **ndala** | ebiri | edatu | ené | etaanu | ehúmi njereere |
| Defoid | Ede Ica | **ɔkɔ̃** | eɟi | ɛta | ɛ̃ɛ̃ | ɛwu | ɛya |
| Defoid | Ede (dial.) | **ɔ̀kɛ̃** | mɛ̃́d͡ʒì | mɛ̃́ta | mɛ̃́hɛ̃ | mɛ́hú | mɛ̃́wá |
| Defoid | Ifè | **ɛ̀nɛ / ɔ̀kɔ̃̀** | méèdzì | mɛ́ɛta | mɛ́ɛrɛ̃ | mɛ́ɛrú | maá |
| Mbe | Mbe | **ómè** | bɛ́pʷâl | bɛ́sá | bɛ́ñî | bɛ́tʃân | bɛ́fwɔ̂r |
| Mbam | Nomaande | **ɔmɔtɛ́** | béfendí | batátɔ́ | bényíse | batáánɔ́ | bɔ́ɔ́háta |
| Mbam | Tuotomb | **ɔ́mɔ̀** | pɛ́fáⁿd | pɛ́dààt | pínìs | pɛ́tàn | pʷówàt |
| Mbam | Tuki | **umwêːsií** | mówá | mótátó | mwéːné | motáːnó | mwábɔ́tɔ́ |
| Mbam | Yambeta | **ímùʔ** | mɔ́bààn | mɔ́dáád | múnìʔ | mɔ́táàn | mɔ́wád |
| Mbam | Nubaca | **pòmóhò** | mʷǎntʃì | mùtát | mùɲíhì | mùtâːn | mʷapʷat |
| Mbam | Yangben | **pùmòm** | mándɛ̀ | matát | ménì | mátàn | mát |
| Mbam | Numaala | **bùmʷòm** | mâːndɛ̀ | mádád̥ɔ̀ | ménî | mátʰán | mátʰ |
| Mamfe | Denya | **ɡɛ́mâ** | ópéá | ólɛ́ | ónì | ótà | ófíà |

Table 2.2: Specific noun classes in '1'

Table 2.3: Specific noun classes in '10'

### **2.1.2 The grouping of numerals by noun class**

Adjacent numerals are more often grouped by their noun classes. Among different numeral grouping types, several are diffused across all main branches of Benue-Congo. I will list 15 of the more frequent groupings of numerals and illustrate each of them with an example. These groupings are reported in Table 2.6.

Even limiting Table 2.6 to 15 groupings demonstrates the fact that some numerals (for example, '2') are grouped by noun class more often than other numerals (for example, '8'). By analyzing the whole table of groupings (reported in Appendix A-B), the following observations can be made regarding each numeral.


Table 2.4: Common noun classes for '2'-'9'

Table 2.5: Common noun classes for '1' and '10'


**Numeral '1'.** Groupings of the numeral '1' are relatively rare: the majority of languages, obviously, prefer to oppose '1' to all other numerals. In case it is grouped with other numerals, the most frequent grouping is within the first five ('1–5') or six ('1–6') numerals. In the analyzed database there are four languages which differentiate the first two numerals '1–2'. For instance, in Ngoreme (Bantu-E10): *e-mʷe* '1', *e-beɾe* '2', but *i-satɔ* '3'; in Gitonga (S60): *mwéyò* '1', *mbìlì* '2', but *dzì-ná* '4'.

**Numeral '2'.** The numeral '2' reveals the maximum predisposition to groupings. The most frequent are: '2–5' and '2–6'. The grouping '2–4' is significantly less frequent but remains present in the majority of Bantu zones and in other groups of Benue-Congo languages.

**Numeral '3'.** '3' is often found in groupings but is very rarely opposed by noun class to '2'. However, some very interesting examples exist. For example, Mbuun (Bantu-B80): *umwɛ́s* '1', *byɛ̌l* '2', *í-tár* '3', *í-na* '4', *í-tân* '5'. It is worth mentioning that the groupings '3–8' and '3–10' were not encountered in any of the languages examined.

**Numerals '4' and '5'.** The only frequent grouping involving '4' is '2–4' (except groupings that include four numerals or more) and for '5' it is '2–5' or '2–5/10'. The grouping '5–9' was encountered in only five languages, and the groupings '5–10' and '5–8' (in the combination '5–8/10') in only one language. The lack of a frequent grouping '5–9' may seem all the more strange because in many languages the numerals '6–9' are based on '5' (moreover, this type of derivational model can be reconstructed for Proto-Bantu and, perhaps, for Proto-Benue-Congo, with the sole exception of the numeral '8', which was apparently formed from '4'). Another unexpected case is the lack of grouping for '5/10', that is the lack of a specific class for '5' and '10', considering the fact that in many languages '10' is formed from '5'. This model was encountered only in one dialect of Eggon: *ò-tnó* '5', and *ó-kpo* '10', while in other numerals the noun class is not marked (I am not aware whether the different tone on the prefix indicates a different noun class).

**Numeral '6'.** A high number of groupings of '6–9' is natural. In many languages it becomes '6–8' because of the specific derivation of the number '9'. In contrast, groupings '6–10' are very rare.

**Numeral '7'.** It is worth mentioning the frequent grouping of '7–8' (21 languages). We are dealing not with one concrete class in Benue-Congo but rather a similar way of marking the numerals '7' and '8'. In the three examples reported in Table 2.3 the presumably common cl7 (Cilungu **tʃí-**, Sakata **ke-**, Xhosa **si-**) was found, in other languages a number of different classes can be encountered (Table 2.7).

**Numerals '8', '9', '10'.** The same characteristic is typical for the frequent groupings of '8–9' and '9–10', shown in Tables 2.8–2.9.

Table 2.6: The most frequent groupings of numerals based on noun classes in Benue-Congo languages



and '9' (class **ɛ̀-**), and also '7–8' and '10' (class **à-**).


Table 2.7: Groupings of '7'-'8' by noun classes

Table 2.8: Groupings of '8'-'9' by noun classes


Table 2.9: Groupings of '9'-'10' by noun classes



### **2.2 Noun classes in derived (reduplicated) numerals**

Reduplication is widely attested as a means of constructing numerical compounds in NC. This is especially applicable to the pattern '8 = 4 redupl.' which, as we hope to demonstrate below, can be reconstructed at the Proto-Niger-Congo level. Another common pattern (attested, however, with a somewhat lesser degree of frequency) is '6 = 3 redupl.'. Three main strategies pertaining to the use of the noun classes are employed within this derivation scenario:


The number of these strategies is reduced to two in cases where a derived term is non-separable (e.g. derived by partial reduplication). In such cases, the class marker of the source-term can be either employed (Kikuyu *i-tatu* '3' > *i-tatatu* '6'), or not (Vinza *ka-ne* '4' > *mu-nane* '8').

We might expect that while forming '8' from '4', the singular class of the latter would be switched to the plural class of the former. In Bantu languages, however, this is not the case. Apparently already in Proto-Bantu we should reconstruct the derivational model *\*ì-nàì* '4' (cl.sg.5) > *\*mʊ̀-nànàì* '8' (cl.sg.3). However, from an etymological point of view, the class **mu-** represents the reflex of the class 6B.pl and not a reflex of the class 3.sg in Niger-Congo. This question raises an additional and very important topic which cannot be examined in the present study (the arguments in favor of class 6B.pl **mu** in Proto-Niger-Congo can be found in Pozdniakov 2013).
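The partial reduplication discussed above can be illustrated with a minimal sketch. It assumes a deliberately naive syllabification (the first syllable is everything up to and including the first vowel, together with its tone marks) and a small, hypothetical vowel inventory; it is not a claim about the phonology of any particular language. It reproduces only the cases where the copied syllable is identical to the first syllable of the source stem, and allows the class marker to be kept or replaced.

```python
import unicodedata

# Illustrative vowel inventory (an assumption made for this sketch only).
VOWELS = set("aeiouɛɔʊɪ")

def first_syllable(stem):
    """Return the initial (C)V of a stem, keeping tone marks on the vowel."""
    nfd = unicodedata.normalize("NFD", stem)
    for i, ch in enumerate(nfd):
        if ch in VOWELS:
            end = i + 1
            # keep combining diacritics (e.g. tone marks) attached to the vowel
            while end < len(nfd) and unicodedata.combining(nfd[end]):
                end += 1
            return nfd[:end]
    return nfd  # no vowel found: treat the whole stem as one syllable

def reduplicate(form, new_prefix=None):
    """Derive a 'PREFIX-STEM' form by copying the first syllable of the stem."""
    prefix, stem = form.split("-", 1)
    derived = first_syllable(stem) + unicodedata.normalize("NFD", stem)
    marker = prefix if new_prefix is None else new_prefix
    return unicodedata.normalize("NFC", f"{marker}-{derived}")

print(reduplicate("i-tatu"))                   # Kikuyu '3' > '6', class marker kept
print(reduplicate("ì-nàì", new_prefix="mʊ̀"))   # Proto-Bantu '4' > '8', class marker changed
```

The Vinza pair *ka-ne* '4' > *mu-nane* '8' is intentionally left out: the vowel of the added syllable differs from that of the stem, so a plain copy of the first syllable would not produce it.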

**Bantu languages.** The following presents partial data on the numeral system in Myene (B10)<sup>4</sup> (Table 2.10).

Table 2.10: Myene numerals

First of all, it is interesting to highlight a variety of noun classes in the left column of the table and their uniformity in the right one. In the numerals from 1 to 10, the system includes four different singular noun classes: **N-** (cl9) – '1–4', **ò-** (cl3) – '5–7' (the numeral '7' is formed as '6+1', where *nómò* means «the only one, the same»), **è-** (cl7) – '8–9' (the numeral '8' is a reduplicated form of '4', the numeral '9' is formed as '9 = 10 – 1') and finally, **ì-** (cl5) – '10'. A homorganic nasal can be quite reliably reconstructed in '1–4', sometimes appealing to indirect characteristics. For example, in *tʃáɾó* '3' the nasal is absent but in Myene **tʃ-** is not a reflex of **\*t.** In this language **\*t-** > **r-**, as can also be seen in the second formative of '30'. The initial **tʃ-** can be traced back to **\*N-r-**.

<sup>4</sup>Thanks to Odette Ambouroué for some clarifications and for a profitable discussion on noun classes in Myene.

In numerals of dozens only cl6 **à-** is used, which is one of the plural classes (with a collective meaning). An interesting detail: in '20' – '50' the second formative agrees with the first one in noun class (**á-**), and in '60' – '90' there is no agreement (the second formative maintains noun classes which mark the units as in independent forms; its high tone is due to the high tone in the preceding root *ɣóm*).

Non-derived numeral '100' belongs, as '1', to the singular class cl9. Does the second formative of '200' agree with the first one? It is impossible to say, because the noun classes of both formatives coincide when used singularly.

Finally, it is possible to formulate the principle of derivation with reference to the noun classes: the numeral '10', being a formative of numerals '20' – '90', maintains its meaning but changes the singular noun class to a plural noun class following the most standard sg ~ pl correlation in the language. For cl.sg.5 (**ì-** in Myene) which is expressed through *ì-ɣómí* '10', the standard correlate is cl.pl.6 (**à-**). Concerning the second correlate (units), it agrees with the first one (dozens) in the numerals that even in independent use show agreement with nouns (in Bantu numerals '1–5' show agreement with nouns). For this reason in numerals '20'–'50' units from '2' to '5' agree with '10' in its plural form and in '60'–'90' second formatives '6'–'9' do not show agreement.

If we compare the number features of simple and derived forms, the formation of the numerals '20' – '50' in Myene can be represented as sg > pl-pl, and that of the numerals '60' – '90' as sg > pl-sg.
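The Myene principle described above can be restated as a small sketch. The class assignments are those given in the text (cl9 for '1–4', cl3 for '5–7', cl7 for '8–9', cl5 for '10' with plural correlate cl6); the function names `tens_classes` and `number_pattern` are illustrative assumptions, and the code only encodes the stated rule rather than any actual Myene forms.

```python
# Independent (counting) classes of the Myene units, as described in the text:
# cl9 N- for '1–4', cl3 ò- for '5–7', cl7 è- for '8–9'.
UNIT_CLASS = {2: "cl9", 3: "cl9", 4: "cl9", 5: "cl3",
              6: "cl3", 7: "cl3", 8: "cl7", 9: "cl7"}

TEN_PL = "cl6"                 # plural correlate (à-) of cl5 ì-ɣómí '10'
AGREEING_UNITS = range(2, 6)   # units '2'–'5' agree with the first formative

def tens_classes(multiplier):
    """Return (class of first formative, class of second formative) for 10*multiplier."""
    first = TEN_PL
    second = TEN_PL if multiplier in AGREEING_UNITS else UNIT_CLASS[multiplier]
    return first, second

def number_pattern(multiplier):
    """Summarise the derivation as 'sg > pl-pl' or 'sg > pl-sg'."""
    _, second = tens_classes(multiplier)
    return "sg > pl-pl" if second == TEN_PL else "sg > pl-sg"

for m in range(2, 10):
    print(m * 10, tens_classes(m), number_pattern(m))
```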

This system is quite typical for Bantu languages, although the variation is considerable. The main variations are illustrated in Table 2.11, including languages only from the zone J.

Table 2.11: Number patterns in derived numerals


The Hema example demonstrates that the pluralization of the class for the formation of derived numerals is not mandatory (at least, for hundreds and thousands), although it unconditionally dominates in the languages of this group (Shi, Chiga, Ganda, Soga). If the simple numeral is already marked for plural class (there are examples demonstrating this), the first formative of the derived numeral appears with a new plural class (for example, in Shi). In the combination sg > pl-pl the plural classes in a composed derived numeral can be different (Ganda, derivation '1000' > '2000').

While forming a word combination from one word, the number of possible combinations of singular and plural classes amounts to eight. As shown in the table, only four of these combinations are actually encountered. No languages show the combinations sg > sg-sg, pl > sg-sg, pl > sg-pl, pl > pl-sg. This distribution demonstrates how pluralization is used for the formation of numerals of higher rank. This strategy can be systematically found in other branches of Niger-Congo.
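The count of eight follows from three binary choices (number of the source class, of the first formative, and of the second formative). A brief sketch enumerates them and removes the four patterns reported above as unattested; the four remaining patterns are obtained here by elimination only, not read off the table.

```python
from itertools import product

NUMBERS = ("sg", "pl")

# source class > class of first formative - class of second formative
all_patterns = [
    f"{src} > {first}-{second}"
    for src, first, second in product(NUMBERS, repeat=3)
]

# Patterns the text reports as not found in any language of the sample.
unattested = {"sg > sg-sg", "pl > sg-sg", "pl > sg-pl", "pl > pl-sg"}

attested = [p for p in all_patterns if p not in unattested]

print(len(all_patterns))
print(attested)
```

In every remaining pattern at least one formative of the derived numeral carries a plural class, which is the point the paragraph makes about pluralization.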

**Atlantic languages.** In order to be able to compare the principles of derivation of numerals in Bantu and in Atlantic languages systematically, we need to first formulate at least three main differences between these systems.

First of all, it is important to highlight that the system of Bantu is decimal, which is not typical for other branches of Niger-Congo, nor for other branches of Benue-Congo. The overwhelming majority of Atlantic languages are '20'-based and not decimal. In these languages, accordingly, '40 = 20\*2' (and often '100 = 20\*5') and very rarely '40 = 10\*4'.

Secondly, in Atlantic languages the numerals '6–9' are systematically formed following the model '5' + '1, 2, 3, 4'. This model does not permit the change of noun classes for the numerals '6–7' and/or '7–9'. The numerals '6–9' maintain all the characteristics of '5' (first formative) and '1–4' (second formative).

Thirdly, contrary to Bantu, the majority of forms of '5' are formed from the lexeme 'hand', maintaining the noun class of this lexeme. In Proto-Bantu 'hand' and 'five' are reconstructed as different roots.

The sum of the abovementioned factors explains the fact that noun classes in the numerals '6–9' are of no concern to the present study. Nonetheless, as will be further demonstrated, the main principle of interaction between noun classes and numbers in the numeral system of Atlantic languages is similar to that of Bantu.
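The arithmetic models contrasted above can be written out schematically. The sketch below covers only the patterns quoted in the text ('40 = 20\*2', '100 = 20\*5', '40 = 10\*4', and '6–9' = '5' + '1–4'); the function names are hypothetical, and real systems are of course more varied than these two idealized types.

```python
def vigesimal_model(n):
    """Atlantic-type analysis: '6–9' built on 5, multiples of 20 built on 20."""
    if 6 <= n <= 9:
        return f"5 + {n - 5}"
    if n > 20 and n % 20 == 0:
        return f"20 * {n // 20}"
    return str(n)  # treated as non-derived in this sketch

def decimal_model(n):
    """Bantu-type analysis: multiples of 10 built on 10."""
    if n > 10 and n % 10 == 0:
        return f"10 * {n // 10}"
    return str(n)  # treated as non-derived in this sketch

for n in (7, 40, 60, 80, 100):
    print(n, "|", vigesimal_model(n), "|", decimal_model(n))
```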

Apparently, derived numerals were already formed following the model '40 = 20\*2', '60 = 20\*3', '80 = 20\*4' in Proto-Atlantic. Different strategies of agreement are partially shown in Table 2.12 (only the simplest cases are reported).


Table 2.12: Atlantic languages: noun classes in the derived numerals


As demonstrated in Table 2.12, the majority of Atlantic languages within the Bak branch (Bijogo, Banjal, Kasa, Bayot) show that in the numeral '40' ('60', '80') the units '2' ('3', '4') agree in general according to a plural class and not according to the class of the numeral '20'. The same principle is characteristic for the languages of Benue-Congo. In all four abovementioned languages, the formation of '40' is based on the agreement in number as for animate nouns cl1.sg – cl2.pl (this is very clear especially knowing the etymology of the numeral '20').

Pluralization as a form of derivation is used when the form of the numeral '20' is not transparent (Kwaatay *butuman* '20', unclear etymology, Nyun Gunyamolo *buruhur* '20' (possibly from «price + man»); in the numeral '40' lexemes are used with the meaning 'people'). In some languages (Karon) the agreement is based on the singular class of the numeral '20' and not on its plural correlate.

In Atlantic languages that, like Bantu, systematically follow the decimal system, the pluralization of the class permits the formation of new numerals (more often as word combinations) (Table 2.13).


Table 2.13: Agreement in numerals derived from '10'

In such cases agreement of the formatives can be observed, that is the same noun class is used for dozens and units. In the languages where '20' is formed from '10' (10\*2), the units more often do not show agreement:


Even in the following case the use of a plural class for units is possible: Baga Fore *ɛ-tɛlɛ* '10', *ɛ-tɛlɛ mɛn-di* '20' (*ʃi-di* '2'), *ɛ-tɛlɛ mɛ-nɛŋ* '40' (*ʃi-nɛŋ* '4').


Finally, in order to complete the description, hybrid compound forms will be reported, that is, forms in which '40' is built on the root of '20' rather than of '10', but combined with the unit '4' and not '2'. This means that in '20' – '90' a root for '10' is used which is different from the main root:


In spite of the plurality of strategies, the modern systems of agreement of units in the dozens reflect a significant distinction that is characteristic of the two main branches of Atlantic languages – Northern and Bak. Apparently, the protolanguages of the Bak group maintained the principle of agreement which was typical for Proto-Niger-Congo, that is, the agreement of units following the plural correlate of '10' or '20'. This principle was lost in the system of the Northern branch, where it can be encountered in only one of the Tenda languages, Basari. It is also present in Nyun Gunyamolo, but in this language, as highlighted by various scholars, the numeral '20' (and probably the whole agreement model) is borrowed from Joola (Bak).

The model of agreement in '200'/ '2000' works in a similar way, as shown in Table 2.14.

Table 2.14: Agreement in '200' and '2000'


As observed for dozens, the agreement in '200' and '2000' can be systematically observed only in the languages of the Bak group (languages 1–5 in Table 2.14). In the Northern group this agreement is found only in Basari (7). Even in Konyagi, the fact of agreement is not clear because in this language the CM of '2' in '200' and '2000' coincides with the CM of cl2 in independent use (for the same reason it is not clear whether we encounter agreement in Baga Foré (5)). Moreover, there is no agreement in Nalu (6), a language of the same branch.

In the majority of languages, the noun classes of '200' and '2000' systematically differ from the noun classes of units and dozens. This is typical for Niger-Congo, perhaps because in '100'/'200' and '1000'/'2000' we are often dealing with borrowings.

**Mel languages.** The present analysis will be limited to the data from one Mel language, that is Temne (Kərata dialect) collected by David Odden (Table 2.15).


Table 2.15: Noun classes in Temne numerals

The numerals '1–4' in counting forms belong to cl.sg **pV-**. The numeral '5' can be traced back to the form with positive meaning of definiteness (*\*ta-tam-at*) – as well as '10' (< *\*ta-fu-at*), initially having the structure CV-CVC-VC, where CV- and -VC are allomorphs of the noun class in a definite form and CVC is the root (Pozdniakov 1993: 143–144).<sup>5</sup> For us, it is important that the numerals '5' and '10' can be reconstructed with cl.sg **ta-**. The non-derived numeral '20' can be traced to cl.sg, and in particular **kə-**. The numerals '40' – '90' are formed with the change of the noun class in the first formative to cl.pl **tə-**. Furthermore, the second formative agrees with the first one in noun class and consequently is also included in the class **tə-**. That is to say, this is the same derivational model as in Bantu and in Atlantic languages. This model emerges as well in the formation of '100' and '200'. In the borrowed form *kɛmɛ* '100' the initial root consonant can be interpreted as a singular CM (the same noun class as in '20'). That means that '200' is used as its plural correlate and the original root consonant gives us **t-**. Finally, the correlation of '1000' ~ '2000' can be interpreted as correlation in number but with a new pair of classes: cl.sg **ʌ**- ~ cl.pl **ɛ-**.

<sup>5</sup> It is clear that '5' and 'hand' have assonance in the languages of the group. Due to space limitations, it is impossible to explain the complicated emergence of this assonance. Let's also leave aside details on the first formative in the numerals '6–9'.

**Gur languages.** An example of an interesting system from the Ditammari language (Oti-Volta) follows (Table 2.16).



In this example we can see the correlation of number classes in derivatives and «agreement» between the parts of the syntagm in '200' and '2000' using different structures of class markers (prefixes, suffixes, confixes, or the lack of a marker).

Similar formation strategies of derived forms can be found in another language from the Gurma group (Oti-Volta), Miyobe (Table 2.17).

Table 2.17: Miyobe: noun classes in derived numerals


In '20' (10\*2) and in '2000' (1000\*2) the plural correlate of cl.sg **kV-** (cl.pl **ɑ́-**) is used. In '2000' the numeral '2' agrees in noun class with '1000' (the root is formed from the word with the meaning 'sack'). In '200' the reduplication of '100' and a special class marker (cl.pl **mɛ**) for the formative '2' are used.


Another language of the Gurma group, Ntcham, follows the same standard model (Table 2.18).


Table 2.18: Ntcham: noun classes in derived numerals

The numeral '200' is formed from '100' by changing from the singular class to the plural one.

The existence of similar strategies for use of plural class markers for the formation of numerals of higher rank in different areas of Niger-Congo (Benue-Congo, Atlantic languages, Mel languages and Gur languages) permits us to presume that similar principles of interaction between noun classes and numbers were typical for the system of Niger-Congo as well. There are no traces of derivative pluralization in Kru and Ijo languages, but they can surely be found in Kwa languages. I did not manage to find similar strategies in the Adamawa and Ubangi languages, nonetheless traces can be found in Kordofanian languages.

Here is an example from Koalib, a Kordofanian language (Table 2.19).

Table 2.19: Koalib example


A prefix for the plural class is used for the formation of the numeral '40'. The formative '2' in '40' agrees with the formative '20' in the noun class. In '200' the prefix of singular class cl1 is used, which includes animate nouns and borrowings. In '2000', the formative '2' takes the prefix **w-**, a standard agreement marker for vocalic noun classes.

Traces of pluralization of noun classes as a means of derivation in numerals can be found in Moro and Acheron (both are Kordofanian languages).

This distribution gives us sufficient grounds to assume that derivation for the formation of dozens in Niger-Congo was similarly established in Proto-Niger-Congo.


### **2.3 Noun class as a tool for the formation of numerals**

Finally, there is one (perhaps the most interesting) strategy for the formation of derived numerals. It consists exclusively of changing the noun class for the formation of a derived form. The system from Efik is partially reported below (Table 2.20).

Table 2.20: Efik example


In Efik, as in the majority of Niger-Congo languages, a stable correlation in number cl5.sg ~ cl6.pl can be found: in Efik reflexes of these classes are accordingly **í-** ~ **à-**. A simple change of singular class to plural (with no compound forms and no reduplication) is enough to form '40' from '2', '60' from '3' and '80' from '4'. Apparently, this system uses '20' as its primary base.

The formation of new numerals by a change in noun class can be encountered in some languages of Benue-Congo, including Bantu (Table 2.21).


Table 2.21: Benue-Congo examples

This technique is mostly used in Bantu languages within the zone J. The data reported in Table 2.21 do not necessarily signify that the conceptual base for derivation is the pluralization of original forms. In Tiene, Sengele, and Ndengese, derived numerals, as well as base numerals, belong to singular noun classes.

For example, in the J10 languages the model sg > sg is characteristic of four derivations, which can be illustrated by the Gundu language (Table 2.22).

Table 2.22: Gundu number patterns in the derivations of numerals

Other derivations sg > sg can be found occasionally. Apparently, the forms *ndatu* '6' > *tʃí-ɾatu* '60' (cl9 > cl7) and *mú-nanɛ* '8' > *lú-nanɛ* '80' (cl3 > cl11) were encountered only in Tembo (J50). We can see that the choice of nominal classes differs in different languages, that is, it is not the symbolic semantics of nominal classes that is most important, but rather their paradigmatic modification.

In Bantu J10-J20 we find a triple derivation model cl5-*kumi* (or cl9-) '10' ~ cl7 *kumi* '100' ~ cl11-*kumi* '1000'. Thus in Hema, *i-kumi* '10' ~ *ki-kumi* '100' ~ *ru-kumi* '1000'.
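The Hema triple can be read as a small lookup from class prefix to power of ten. The Python sketch below is only an illustration: the names `HEMA_KUMI_PREFIXES` and `kumi_value` are hypothetical, the segmentation at the hyphen is assumed, and only the three Hema forms cited above are covered.

```python
# Hema forms of the root kumi, as cited in the text (class prefix -> value).
# The table name and the function are illustrative assumptions.
HEMA_KUMI_PREFIXES = {"i": 10, "ki": 100, "ru": 1000}


def kumi_value(form: str) -> int:
    """Return the numeric value of a Hema form built on the root kumi."""
    prefix, root = form.split("-", 1)
    if root != "kumi":
        raise ValueError(f"not a form of the root kumi: {form!r}")
    return HEMA_KUMI_PREFIXES[prefix]


for form in ("i-kumi", "ki-kumi", "ru-kumi"):
    print(form, kumi_value(form))
```

The root stays constant and only the prefix carries the difference in magnitude, which is the point of the derivational model described here.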

This model can be found in Gur languages as well. In Northern Nuni (Grusi group) tens are formed exclusively by a change in noun class marker. The derivation from '20' to '50' is realized by the change of one singular class to another: *bì-lə̀* '2' > *fíì-lə̀* '20', *bì-twàà* '3' > *fíì-twàà* '30', *bì-nu* '5' > *fíì-nu* '50'. Formation of tens by a change of class is encountered in some Senufo languages as well.

However, the derivational model sg > pl is much more active. In the Bantu zone J, six derivations are typical, illustrated by the following examples from Gwere (J10) (Table 2.23).

Table 2.23: Gwere number patterns in the derivations of numerals


For the numerals '20'–'50' cl6.pl is used, and for '60'–'70' cl10.pl is used. These classes demonstrate the correlation in number with the classes cl5.sg and cl3.sg respectively. In at least four languages in zone J, the model cl3.sg > cl10.pl was encountered for '9' > '90'. In Gwere and Tembo, the model cl5 > cl6 is used in derivation '2' > '20': Gwere *ì-βíɾí* '2' > *ɑ̀ː-βíɾì* '20'.

Only one language, Tembo, systematically presents the model pl > pl, in the derivation cl8.pl > cl6.pl (Table 2.24).




This model is clearly secondary and was implemented as a result of re-interpretation, atypical of zone J, of classes in numerals '2–5', '7' as plural classes opposed to '1'.

The fourth theoretically possible model, pl > sg, has never been encountered in any derivation. This can be considered indirect evidence for the idea that the pluralization of numerals of higher rank is one of the key strategies for the formation of derived numerals, as was demonstrated above. Nonetheless, this strategy does not explain everything.
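The four theoretically possible models can be tallied mechanically once each class is tagged for number. The following Python sketch does so for a handful of derivations cited in this section. It is a minimal illustration, not the full data set behind the tables: the names `CLASS_NUMBER`, `DERIVATIONS` and `model` are hypothetical, and the number values are those given in the text for zone J.

```python
from collections import Counter

# Number value of each noun class, as stated in the text for zone J.
CLASS_NUMBER = {3: "sg", 5: "sg", 7: "sg", 9: "sg", 11: "sg",
                6: "pl", 8: "pl", 10: "pl"}

# (label, source class, target class): a sample of derivations cited above,
# not the complete material of Tables 2.21-2.24.
DERIVATIONS = [
    ("Tembo '6' > '60'", 9, 7),
    ("Tembo '8' > '80'", 3, 11),
    ("Gwere '2' > '20'", 5, 6),
    ("zone J '9' > '90'", 3, 10),
    ("Tembo cl8 > cl6", 8, 6),
]


def model(source_class: int, target_class: int) -> str:
    """Name the derivational model, e.g. 'sg > pl'."""
    return f"{CLASS_NUMBER[source_class]} > {CLASS_NUMBER[target_class]}"


counts = Counter(model(src, dst) for _, src, dst in DERIVATIONS)
for name in ("sg > sg", "sg > pl", "pl > pl", "pl > sg"):
    print(name, counts[name])
```

In this sample the cell pl > sg stays empty, in line with the observation made here.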

In order to present this elegant mechanism of systematic use of noun classes in the derivation of numerals in greater detail, an example from derivation in Soga using the roots '10' and '2' will be schematically presented. The root meaning '10' matches in Soga with six different class markers, and the root meaning '2' matches with three of them, as shown in Figure 2.1.

Figure 2.1: Soga numerals: derivations by noun classes

In the Soga language the root *kumi* takes part in three forms with a singular class and three forms with a plural class (one is facultative). In the derivations including forms of different numerals it is visible that the most stable correlations in number are cl5-cl6, cl7-cl8 and cl11-cl10. However, the choice of cl7 and cl11 for the derivations (as shown in Figure 2.1) seems to be arbitrary. According to Larry Hyman (p.c.), in the Lulamoji dialect the archaic form of the numeral '1000' belongs to cl11 and not to cl14 (Hyman: «*óBu-kumí* '1000', older usage»).

The root *βiɾi* does not take part in singular derivates but was found in three derivates where *kumi* is marked by plural class markers. The main derivate from *ì-βìɾì* '2' can function separately outside of the word combination (*ɑ̀ː-βíɾí* '20'). In this case, the main correlation in number for class 5 is used (cl5-cl6). The difference in the class markers cl6 **mɑ-** and **ɑː-** (in some dialects **ga-**) is related to the characteristics of the paradigms of agreement markers. A question about the nature of *ì-βíɾì* in '2000' emerges. Does it belong to cl5, or is this a homonymous form of the agreement marker in cl10? These questions are very hard to answer because we are dealing with derivational forms of class markers (often homonymous) and we cannot rely on the context of agreement in order to test them.

In fact, the number of classes in numerals (both singular and plural) can be even bigger. In Soga, a singular form of '8' *mù-nɑ́ː-nɑ̀* (cl3) is always formed from the numeral '4' *í-nɑ̀* (cl5). In Mpumpong (Bantu, A80), the system of numerals includes four different plural noun classes: cl8 for units (*tɛ̂n nɛ̀ ì-nâ* '9', 5+4), cl6 for tens (*mɛ̀-kàm mɛ̀-mbá* '20', 10\*2), cl4 for hundreds (*mì-tsȅt mì-mbá* '200', 100\*2), and cl2 for thousands (*ò-tɔ́sìn ò-bá* '2000', 1000\*2).

The model of formation that was masterly developed by Soga has major relevance not only for the history of numerals in Niger-Congo, but for the theoretical analysis of the semantics of noun classes as well. The signifier of morphemes in noun class paradigms has a multilayer structure. This structure presumes that the semantics of each class can be defined through the paradigm at the intersection of four parameters: classificational, paradigmatic, syntagmatic and modal (for a more detailed discussion see Pozdniakov 2003). It is useless to discuss the classificational aspect of noun class semantics in Soga numerals as we do when classes for humans, trees or animals are taken in consideration. The paradigmatic aspect of the signifier of the signs is the most relevant because the primary role is given to the correlation of classes in number, while some other paradigmatic correlations remain important as well.

In conclusion, it should be noted that the noun class switch as a derivation mechanism is not limited to Benue-Congo and can be reconstructed at the Proto-Niger-Congo level in at least one case (see Chapter 5).

### **3.1 Issues pertaining to the detection of alignments by analogy**

In addition to the grouping of numbers by noun class, a number of more radical strategies are used in the Niger-Congo languages. One of them is the formal alignment of numbers resulting from the diachronic alignment of forms by analogy. This strategy implies irregular phonetic changes in lexical stems. As a result, contiguous numerals in the Niger-Congo languages often have similar forms, that is they have common phonetic element(s).

Such cases are not easily distinguishable from phonetic similarities conditioned by morphological changes, when affixes that are no longer productive blend into lexical roots, for instance, or archaic noun class markers exist in the numerals. Thus, in Wolof, as shown in the introduction, phonetic similarities arise in the numerals '2'–'4' (*ñaar* '2', *ñett* '3', *ñeent* '4') as a result of inclusion of the noun class marker Ñ in the lexical roots.

Only specialists in a particular language can distinguish between morphological "accidents" and phonetic analogical changes, but sometimes even synchronic competence may not be enough. Table 3.1 shows the first six numerals in five Adamawa languages.


Table 3.1: Adamawa examples

In Tunya (1) it is clear that the initial **a-** in the numerals '2'-'5' etymologically has the nature of a noun class marker. In Vere (2) the final syllable **-ko** can hardly be considered a noun class marker, but it is very likely that we are dealing with a morpheme and not with a phonetic alignment of numerals. In Mom Jango (3) the final **-z** in '1'-'4' and '6' is difficult to comment on; it is likely that this is an analogical change but its direction is not very clear. In Dirrim (4) *bara – tara – nara* is a case of analogical change and, considering the diachronic context, the numerals '2' and '4' were clustered together with '3'. In Pere (5), the final **-o** in '2'-'5' may represent an analogical alignment or a morpheme.

Let us exclude all the cases of integration of noun class markers into stems and consider all the other cases of phonetic (or hidden morphological) clustering in the systems of numerals in Niger-Congo. We will deal mainly with two questions:


The topic of the present chapter is not relevant to all the branches of Niger-Congo. For instance, in Bantu and Benue-Congo there is no systematic analogic phonetic alignment. But in some other branches it is impossible to discuss the etymology of numerals without considering this factor. In the twelve main branches of Niger-Congo the situation is as shown in Table 3.2.

In the first three branches the minus does not mean that there is no phonetic alignment of numerals. Some examples from Benue-Congo languages are given in Table 3.3.

Each of these examples is interesting for the study of the individual languages, but these seem to be the only languages, among hundreds of BC languages, where analogical changes have been found; therefore, no systematic changes of this type have been attested for the BC family.

In Mel there is only one case which is of interest to us, that is the unification of the initial root consonant in Krim: *yi-gin* '2', *yi-ga* '3'. The direction of analogical alignment in this case is not clear. It is impossible to study this particular case here, because the discussion of possible hypotheses would require a separate publication. It is important to underline that in other Mel languages cases of phonetic alignment of numerals have not been attested.

There are virtually no unifications of this type in Kru, excluding the phonetic alignment of the initial consonant in '4'-'5', reported in Table 3.4.


Table 3.2: Analogic alignment in NC numerals

Table 3.3: BC examples of analogic alignments


Table 3.4: Kru alignments in '4'–'5'


I venture to assume (based on these data) that the initial consonant in '4' has undergone analogical change with the consonant in '5'. The final judgment should be made by specialists. In Ijo this type of alignment is absent.

### **3.2 Mande**

There are no systematic analogical changes in the systems of numerals in Mande languages.<sup>1</sup> Some languages like Busa, San (South-Eastern branch) and Soninke (Western branch) present exceptional cases.

In Busa, we are probably dealing with the fossilized suffix **-hõ** which can be found inside the lexical roots of '3' and '4': \**a-hõ* '3', \**si-hõ* '4', i.e. the phonetic similarity can be explained morphologically.

In San, apparently, the regular reflex of the three different consonants of protolanguage of South-Eastern Mande is **s-** (see 3.10 below). Finally, three of the contiguous numerals start with the same consonant: *so* '3', *si* '4', *soro* '5'.

Soninke represents a more complicated case, wherein the last vowel of each numeral is not distributed randomly (Table 3.5).


Table 3.5: Soninke

In '1' there is a particular vowel **-e**. "Minor" numerals ('2'-'5') have the final **-o**, and all the higher numerals ('6'-'10') have the final **-u**. Nazam Halaouï (Halaouï 1990) proposes the following reconstruction: *fill-a* '2' (active voice) / *fill-e* '2' (passive voice) > *fill-e-nu* (pl) '2' > *fill-o* (pl) '2'. In other words, in the numerals '2'-'5' the vowel **-o** is interpreted by Halaouï as a phonetically conditioned allomorph of the plural morpheme **-nu**. But in the numerals '6'-'10' another vowel is found, not **-o** but **-u**. Halaouï explains this in the following way: the irregular final vowel **-u** initially appeared in the numeral '6' as a consequence of progressive assimilation (\**tunm-o* > *tunmu*), and then, by analogy, this vowel appeared in the numerals '7'-'10'. Halaouï's hypothesis is not plausible (it presupposes a doubtful phonetic change \***e-nu** > **-o** in the numerals '2'-'5'), nor is it the only possible one.

<sup>1</sup> I would like to thank Valentin Vydrin for a detailed discussion of the history of numerals in Mande languages.

Valentin Vydrin (2006: 171–204) shows that Soninke has two different plural suffixes, **-u/-o** and **-ni/-nu** (the allomorphs **-u** and **-o** are dialectal variants; the same is true for **-nu** and **-ni**). It is not quite clear whether we have the generic plural marker **-u** in all the numerals from '6' through '10', or whether it is the alternative plural marker **-nu** that appears in '6' and '10', while the generic plural **-u** appears in '7' through '9'. In any case, it is evident that in the right column of Table 3.5 the final **-u** is of morphological origin, rather than a result of an analogical change. The appearance of a plural marker in the numerals '6'-'10' is noteworthy in itself; these numerals should be interpreted as pluralia tantum. Interpretation of the final **-o** in '2'-'5' is much more problematic. There is a singular morpheme **-o** in Soninke; however, Vydrin's data do not clarify why it is **-o**, rather than **-e** or **-Ø**. Therefore, it can be conjectured that the final vowel of the numerals '2'-'5' results from analogical changes.

Now let us move to the branches where analogical changes are systematic. Even in these cases we will encounter different examples.

### **3.3 Atlantic**

In Table 3.6, the data on the first five numerals in ten Joola languages are compared.


Table 3.6: Joola

In the last group, apparently, there is no reason to posit phonetic alignments. In the first two groups, by contrast, such alignments are evident. In the first group the velar consonant is spread, and in the second group, the liquid consonant; furthermore, the roots are mostly related. These are classical "symptoms" of analogical change. It is clear that it is useless to etymologize the numerals without an in-depth analysis of these alignments.

Joola languages form one of the four branches of the Bak group in Atlantic. In Bijogo, there are no analogical changes in numerals. In the other two branches, these changes of various types can be found, and such changes differ from the type of changes in Joola.

In Pepel (Manjak branch) in some sources the numerals '2' and '3' have a final **-s**, in other sources they have a final **-ʈ** and in Koelle (1963[1854]) the final consonants are different, which can correspond to the situation in proto-language (Table 3.7).

Table 3.7: Pepel


In the branch represented by the isolated Balant languages (Senegal; according to the data from Creissels & Biaye 2015), the following forms exist for the numerals '2' and '3' (Table 3.8).


Apparently the numeral '2' has undergone the analogical change following the numeral '3'. The sources on Balant Kentohe give different but also phonetically clustered forms: *-sebm* '2', *-abm* '3'.

It is important to underline that analogical changes in the three aforementioned branches of Bak languages are not historically related: these changes are of different origin. This means that for this group, the principle of phonetic alignments of numerals is characteristic, but different types of changes by analogy co-exist. A similar situation is also typical of Northern Atlantic languages, which show other types of phonetic alignments.

In Wolof, as previously mentioned, the alignment of the initial consonant in numerals '2'-'4' is of a morphological nature; these numerals maintain traces of the noun class prefix. Still, for native speakers these forms contain a similar phonetic marker that groups together the numerals for '2'-'4' and distinguishes them from other numerals.

In Sereer (Northern Atlantic), as in Joola (Bak Atlantic), the final velar can be clearly seen in the numerals '2'-'5': ƭ**ik** '2', tad**ik** '3', nah**ik** '4', ƥet**ik** '5'. Here the clustering involves not only the final consonant but the preceding vowel as well, which creates the illusion of a specific morpheme ('suffix' **-ik**) used for marking the numerals '2'-'5'. As will be demonstrated later, this is a false intuition: in Sereer we are dealing with morphophonology and not with morphology. Moreover, the coincidence with Joola is not accidental and reflects an important phonetic innovation which took place in Proto-Atlantic.
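The shared ending of the Sereer forms can be extracted mechanically. The Python sketch below computes the longest string of characters common to the end of every form. It is only an illustration: the function name `shared_suffix` is hypothetical, and the comparison works on written characters, not on phonological segments, so it says nothing about whether the shared material is a morpheme.

```python
def shared_suffix(forms):
    """Return the longest string of characters that ends every form."""
    suffix = []
    # Walk all forms from their last character backwards, in parallel.
    for chars in zip(*(reversed(form) for form in forms)):
        if len(set(chars)) != 1:
            break
        suffix.append(chars[0])
    return "".join(reversed(suffix))


# Sereer numerals '2'-'5' as cited in the text.
sereer = {2: "ƭik", 3: "tadik", 4: "nahik", 5: "ƥetik"}
print(shared_suffix(sereer.values()))
```

For the four Sereer forms the function returns `ik`, the sequence that gives the impression of a suffix.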

In Nyun (the branch Nyun-Buy, Northern Atlantic languages) form clustering occurs through the final velar **-k** as well: -**n**du**k** '1', -**n**a**k** '2', -re-**n**e**k** '4'. It is worth highlighting that the initial consonant of the aforementioned forms is also unified (**n**-).

The same isogloss can be encountered, although in a shorter version, in one of the five languages of the Cangin branch, namely Palor: ka-na**k** '2', ke-je**k** '3'. For Cangin this alignment is definitely marginal; in all the languages of the Cangin branch another analogical change is encountered: the initial consonant in the numerals '1'-'2' is unified, which is a rare phenomenon. In Proto-Cangin we have *\*ji-noʔ* '1', *\*ka-nak* '2' with the maintenance of the initial **n-** in all five languages (compare with the unifications in Nyun).

The final **-n** is the basis for phonetic alignment in Sua, though the affiliation to Atlantic languages has not been proven: sɔ**n** '1', m-ce**n** '2', b-rar '3', m-na**n** '4', sugu**n** '5'.

### **3.4 Kwa**

54 out of the 111 sources for Kwa languages available in our database show a common initial consonant **n-** for the numerals '4' and '5'. For example, in Nzema: *na* '4', *nu* '5'. In the other half of the sources forms with **n-** can be found for '4' and with initial **t-** for '5'; for example, in Gbe-Fon: *e-ne* '4', *a-ton* '5'. The latter forms correspond to Proto-Bantu numerals: *\*nàì* '4', *\*táànò* '5'. The question then arises: where do the forms for '5' with initial **n-** come from?

Mary Esther Kropp Dakubu (Kropp Dakubu 2012) includes the forms of the numeral '4' in the series of correspondences which go back to **\*n-** and reflect as **n-** in all of the main branches of the family except for Ga-Dangme (GD): Proto-Potou-Tano \**-nã*, Tano \*-*nã*, GTM (Ghana–Togo Mountain) \**-inâ*, Gbe *e-ne*. The author includes the numeral '5' in the series 15b where Akan and GD both have **n**-, in Gbe **t**-, and inside GTM are both **t**- and **n**- (Na-Togo). Mary Esther Kropp Dakubu suggests the following historical interpretation of these forms:

The fact that GTM is reconstructed with \***t**-, but its NA sub-group with \***n**, suggests that the **n** of Akan and GD are also secondary, and that these forms are to be reconstructed as beginning in Kwa **\*t** (ibid., p.24).

All the details of complex reconstruction will not be discussed here, but this shows that modern Kwa languages come from \*PTB (Proto-Potou-Tano-Bantu). It is worth underlining that the reported reconstruction does not explain why in some of the Kwa languages the numeral '5' with initial **\*t-** has changed to **n-**. Furthermore, she does not explain why this irregular change has happened in the aforementioned languages and not in the others.

The most natural answer to the first question is that in some languages, in the numeral '5' the initial consonant has undergone analogical change with the numeral '4'. As a result, the same consonant was formed in both numerals.
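The contrast between the two patterns can be made explicit with a small check on the root-initial consonant of '4' and '5'. The Python sketch below uses only the Nzema and Gbe-Fon forms cited above. It is a rough illustration under two assumptions that are not part of the original analysis: the hyphen is taken to mark the boundary between prefix and root, and the first written character of the root is taken to stand for its initial consonant. The names `root_initial`, `initials_aligned` and `KWA_SAMPLE` are hypothetical.

```python
def root_initial(form: str) -> str:
    """First character of the root; the root is whatever follows the last hyphen."""
    root = form.rsplit("-", 1)[-1]
    return root[0]


def initials_aligned(four: str, five: str) -> bool:
    """True if '4' and '5' begin with the same root-initial character."""
    return root_initial(four) == root_initial(five)


# (form for '4', form for '5'), as cited in the text.
KWA_SAMPLE = {
    "Nzema": ("na", "nu"),
    "Gbe-Fon": ("e-ne", "a-ton"),
}

for language, (four, five) in KWA_SAMPLE.items():
    print(language, root_initial(four), root_initial(five),
          initials_aligned(four, five))
```

Nzema comes out as aligned (n, n) and Gbe-Fon as non-aligned (n, t), which is the distribution the analogical hypothesis sets out to explain.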

In order to answer the second question, it is necessary to observe the distribution of forms of '4' and '5' in different branches of Kwa, adding, where necessary, the forms for '3' and '2'. In order to extend the analysis of Mary Esther Kropp Dakubu, the Lagoon languages will be added to her database (Table 3.9).


Table 3.9: Akan

In all the Akan languages the alignment can be observed not only in '4'-'5' but (probably morphologically) also in numerals '2'-'3' (this phenomenon cannot be found outside this cluster). Furthermore, one of the sources clearly indicates a final velar in Abron. Table 3.10 reports data on the main languages of Central Tano.



Nearly identical forms are found in the other three branches of Tano (Table 3.11).

Table 3.11: Krobu-Ega, Western Tano, Tano Guang


<sup>2</sup>One of the sources on Nzema gives forms without an initial nasal: *sa* '3', *da* '4', *du* '5'. Let us note that even in this case the initial consonant is the same in the numerals '4' and '5'.

<sup>3</sup>In some sources Baule numerals '2'-'5' also include a final **-n**.

<sup>4</sup>Thus, in Ahanta the alignment of initial consonants for '4'-'5' is even more clear: **nl-**.

<sup>5</sup>The roots *-na* and *-nu* (for '4' and '5' respectively) can also be found in the Guang group in Awutu, Chumburung, Guang, Kplang, Krache, Nawuri, Nchumburu, Nkonya. For the subsequent exposition it is important that in all these languages the numeral '3' includes an initial **s-**.

Among the numerous Tano languages there is just one language in our database which does not have initial **n-** in '4' and '5'. This language is Ega, which is misleadingly put in the sub-group with Krobu; its attribution to Tano is also doubtful, according to the majority of specialists. The forms of these numerals provide one more argument against this grouping.

Some other languages display unification of the initial consonant in '4'-'5' outside of the Tano group.

As for Potou, forms with the initial **n-** in both '4' and '5': *ne-ni* '4', *ne-na* '5' were found only in Mbato, see Table 3.12.



Examples from Mbato permit us to reconstruct the unification of the initial consonant in '4'-'5' in Potou-Tano. Outside of Potou-Tano this unification, following Mary Esther Kropp Dakubu, was found only in some languages of Na-Togo (GTM). The numerals in the languages of this group are represented in Table 3.13.



In languages (1–4) **n-** appears in '4'–'5' (Anii displays an extreme variant of alignment, with the unification of the final consonant as well). In language (7) the most ancient proto-language initial **t-** is attested in '5', and this means that a reconstruction of *\*n-* in '5' for Proto-Na-Togo is problematic. Furthermore, in languages (5–6) there is no alignment of the forms.


In other Kwa languages consonants in '4' and '5' differ. To be more precise, in Adjoukrou initial consonants are aligned but they are not nasals: *jar* '4', *jen* '5'. All the other forms can be grouped into four main types:


I will provide some examples followed by interpretations.

Type 1 is illustrated in Table 3.14.


Table 3.14: n- '4', t- '5' (t- '3')

It is clear that the basic etymological forms are represented extensively. They are not confined to Potou-Tano or the Lagoon languages but they can be found in four other branches of Kwa as well.

Type 2 is illustrated in Table 3.15.

<sup>6</sup>Harley (2005: 155) "With the exception of mɔa – 'one' and nviã – 'two', the citation forms of these numerals are derived using the expletive third person pronoun ke, which has become incorporated into the attributive numeral : ke ɛlalɛ '3' > kaalɛ, ke ɛna '4' > kɛna …".


Table 3.15: n- '4', phonetic deviations in '5'

Type 2, like Type 1, is not difficult to interpret. In individual languages the reflexes of the original consonant are maintained in '4', while in '5' \***t-** undergoes phonetic changes.

Type 3 is illustrated in Table 3.16.

Table 3.16: t- '5', phonetic deviations in '4'


The proto-language consonant is maintained in only two languages in '5' (Ka-Togo and GTM) while the initial consonant in '4' undergoes regular phonetic change.

And finally, the most difficult type, Type 4, is illustrated in Table 3.17.

Here we see all the counter-examples against the hypothesis on the change \***t-** > **n-** in '5' as analogous to **n-** in '4'. The solution is to imagine that in certain languages belonging to different branches of Kwa (independently from each other) this analogical change occurred first, and that the original \***n-**, which was the basis of the analogical change, was then lost in the numeral '4'.

Finally, let us get back to the question raised above: why does analogical change in '5' take place in only some Kwa languages? Let us have a look at Table 3.18, where different initial root consonants in numerals '3'-'5' within different groups of Kwa are presented.

Table 3.17: n- in '5' but not in '4'

Table 3.18: Kwa initial consonants in '3'-'5'

In the Kwa languages we see a clear tendency: in languages where the initial plosive \***t-** has become the fricative **s-**, the described analogical changes can be found. Where the plosive is maintained, this change is more difficult and can be found in only some of the languages (for example, some of the above-mentioned Na-Togo cases). In this case we have not \*t- > n- '5', but \*t- > s- > n-. This observation may be of interest for the study of analogical changes in general: perhaps 'weak' consonants (for example, fricatives) are more easily involved in analogical processes than 'strong' ones (plosives).

It is curious that this analogical isogloss can be found in a number of other branches of Niger-Congo, including Adamawa, Gur and Dogon (as well as Seenku from the Mande family).

### **3.5 Adamawa**

In Adamawa the above-mentioned analogical change can be found in at least a dozen languages (Table 3.19).

Table 3.19: Initial n- in '4'-'5' in Adamawa languages

However, in Adamawa, analogies are much more widespread than in Kwa. For instance, in Gimme the numerals '2'-'7' share the same final syllable (morpheme?). In Chamba, a similarity can be found not only for '4'-'5' but also for '2'-'3' (the final syllable **-ra**). In Kolbila, the situation is quite similar to the one in Chamba ('2'-'3' share the same final syllable **-nu**), and in '4'-'5' both the initial **n-** and the final **-b** coincide.

Phonetic alignment follows more interesting models in Bangunji, Yendang, Dadiya, Peere and Samba Leko. In these languages, on the one hand, '4'-'5' are still grouped together (because of the initial consonant) and, on the other hand, ('2')-'3'-'4' are also grouped (because of the final syllable). The numerals with the meaning '4' have two simultaneously distinct features which mark two separate groupings. As a result, peculiar minimal pairs arise formed by contiguous numerals; for example, in Yendang: *tat – nat* '3'-'4', *nat – nan* '4'-'5'.
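The Yendang chain can be checked with a simple test for minimal pairs: two forms of equal length that differ in exactly one position. The Python sketch below is an illustration only. The function names `differing_positions` and `is_minimal_pair` are hypothetical, and the comparison is made character by character on the written forms, which is adequate for these three-letter forms but would need proper segmentation for forms with digraphs or diacritics.

```python
def differing_positions(a: str, b: str):
    """Positions where two equally long forms differ; None if lengths differ."""
    if len(a) != len(b):
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def is_minimal_pair(a: str, b: str) -> bool:
    """True if the two forms differ in exactly one position."""
    positions = differing_positions(a, b)
    return positions is not None and len(positions) == 1


# Yendang numerals as cited in the text.
yendang = {3: "tat", 4: "nat", 5: "nan"}

print(differing_positions(yendang[3], yendang[4]))  # initial position
print(differing_positions(yendang[4], yendang[5]))  # final position
print(is_minimal_pair(yendang[3], yendang[5]))      # '3' and '5' differ twice
```

The form for '4' differs from '3' only in its initial consonant and from '5' only in its final consonant, so it belongs to both groupings at once, while '3' and '5' are not a minimal pair.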

Another alignment of numerals, ('2')-'3'-'4', takes place in Adamawa where there is no alignment in numerals '4'-'5'. Minimal pairs like Dirrim *bara* '2' – *tara* '3' – *nara* '4' are a very widespread phenomenon in the languages of this family. Some examples are presented in Table 3.20.

This kind of assonance may seem insignificant, but I would like to underline once more that among hundreds of Benue-Congo languages, it is impossible to find any similar case.



Table 3.20: Adamawa analogical alignments in '3'-'4'

### **3.6 Ubangi**

Yves Moñino (1995) has reconstructed unified forms for '3'-'4' and partly for '5' in Proto-Gbaya. These forms resemble the above-mentioned "minimal pairs" in Adamawa. In Proto-Gbaya: *\*tar(a)* '3', *\*nar(a)* '4', *\*mor* '5' (notably, the numeral '5' coincides with the word 'hand'). In Ubangi-Sere, a different type of alignment can be found: the final **-o** in numerals '2'-'5' (in Ubangi-Zande, the final **-i**) (Table 3.21).


Table 3.21: Final vowel alignments in Ubangi

### **3.7 Gur**

In some languages of the Gur family analogical changes in '4'-'5' can be found, as observed in Kwa and Adamawa (Table 3.22).


Table 3.22: Gur initial n- in '4'-'5'

Like in Chamba (Adamawa), some of the Gur languages have a common feature not only for '4'-'5' but also for '2'-'3'. For instance, in Nawdm and Safaliba, as can be deduced from Table 3.22, the numerals '2'-'3' have a final velar consonant. The final velar can be found in '2'-'3' in Hanga (*a-yik* '2', *a-tak* '3'), and in Dogose it is found in '2'-'5': *i-yok* '2', *i-sak* '3', *i-yik̬* '4', *i-wak* '5'. Gudrun Miehe (Miehe et al. 2007: 157) shows in Khisa (Komono) the final -**Ɂ** in '2'-'5': *ɲɔ́ɔ̀ʔ* '2', *sá aʔ* '3', *ɲéèʔ* '4', *ŋwáàʔ* '5'.

And finally I would like to report a rare case of strong alignment between the numerals '1' and '2' in Mbelime: *yɛ̃nde* '1', *yede* '2'.


### **3.8 Dogon**

Assimilation of the initial consonant in '5' to the initial consonant **n-** in '4' (for example, Tommo So: *nay* '4', *no* '5') is characteristic of practically all the Dogon languages and should be reconstructed already for Proto-Dogon. Other types of unification cannot be found in this family.

### **3.9 Kordofanian**

Phonetic/morphological alignments in this family are quite rare. In what follows, the most interesting cases are reported (Table 3.23).


Table 3.23: Kordofanian alignments

In Talodi the final velar is present, similarly to other branches of Niger-Congo. Some cases of phonetic alignment can be found, though this alignment is confined to individual languages rather than characterizing the whole family.

In sum, the data examined in this chapter can be found in Appendix C where 50 different cases of probable analogical changes in Niger-Congo are highlighted. The Table in Appendix C permits the evaluation of the scale of analogical changes in the system of numerals in Niger-Congo in general.

It is worth mentioning that in the cases where numerals '6'-'10' are not derived, it is very unusual to find phonetic alignment in them (exceptional systems, such as that of Soninke, were previously discussed). For this reason, only the numerals '1'-'5' are included in Appendix C. Three main questions are to be answered concerning these numerals: 1) Which groupings of numerals are most typical for the Niger-Congo languages when we deal with analogical changes? 2) Which phonetic (or hidden morphological) means are used to produce the alignment of numerals? 3) Are there any reasons to consider that similar analogical changes in different branches of Niger-Congo can be diachronically related? Otherwise, can these materials be useful for the study of other isoglosses in Niger-Congo?

As demonstrated in Appendix C, mostly contiguous numerals are aligned (see some rare examples above, for example in Nyun languages, where features for '1'–'2'/'4' are shared, but not for '3').

It is quite rare that '1' shares a submorphemic marker with the numeral '2', while for other contiguous numerals this is more common. Such rare examples are found in Ha (Bantu J) and in Mbelime (Gur). In both languages the forms of numerals '1' and '2' have minimal phonetic difference. As will be demonstrated in the following sections dealing with the etymology of numerals '1' and '2', the forms in Ha (*mbele* '1', *bhili* '2') are of great interest for the diachronic interpretation of numerals.

As can be seen in Appendix C, the final phonemes have phonetic alignment much more often than the initial ones.

The appearance of the diachronically irregular initial **n-** in the numeral '5' as analogous to the regular form of the numeral '4' represents a common feature in different families of Niger-Congo: Potou-Tano (Kwa), Adamawa, Gur and Dogon. More attention should be paid to this phenomenon because it is unlikely that one analogical feature could appear in four different branches of Niger-Congo independently.

There are two remarkable cases in the alignment of final phonemes which are typical for several branches of Niger-Congo.

Firstly, there is the appearance of a final velar (**-k**) in the groupings of the numerals '2'-'5', '2'-'4', '2'-'3', '3'-'4' (in Kordofanian and Atlantic also '1'-'2'- ('3')). This feature is typical for the Atlantic, Adamawa, Gur and Kordofanian groups (thus, one more common feature can be found for Adamawa-Gur). In Benue-Congo and Mande the reported examples are clearly marginal.

Secondly, similarly to the regular dental reflexes of the final consonant in the numeral '3' (\***-t(h)**), in '4' the final consonant undergoes an irregular change (a non-dental consonant becomes dental). This type of change is particularly characteristic of Atlantic, Adamawa and Gbaya (Ubangi), but it is also found in Kordofanian and in Benue-Congo, which do not have analogical changes as characteristic features.

The most common case is the appearance of the identical final vowel in some languages of different families (mostly in numerals '2'-'5'): Mama (Bantoid), Soninke (Mande), Peere (Adamawa) and Ndogo, Pambia (Ubangi).

All the reported cases should be taken into consideration for the process of etymologization of numerals, which will be done in the following chapter.

## **4 Step-by-step reconstruction of numerals in the branches of Niger-Congo**

In this chapter we will try to create a step-by-step reconstruction of numeral systems for each separate family independent of the data from the other NC families. For each family we shall examine the range of basic numerals from '1' to '10' and then the numerals for '20', '100' and '1000'. We begin our overview with the largest family, Benue-Congo.

### **4.1 Benue-Congo**

There is no Benue-Congo classification that is accepted by all scholars. As noted, the inventory of Benue-Congo groups mainly follows the classification of Kay Williamson (1989b: 266–269). We repeat here, as Table 4.1, the scheme of BC given above in the introduction.


Table 4.1: Benue-Congo languages

Let us begin our overview with the largest group of Bantoid languages.


### **4.1.1 The Bantoid languages (including Bantu)**

The reconstruction of numerals in the Bantoid languages is based on 140 sources for the major branches of this family. What follows is the result of our step-by-step analysis of numeral systems in these languages.

#### **4.1.1.1 'One'**

We shall collect the main forms for '1' in different branches of the Bantoid languages. The last column of Table 4.2 shows some isolated forms for '1' which seem to be innovations.

At first glance, the terms for '1' in the majority of the Bantoid languages appear to be quite homogeneous, their roots being traceable to either *\*moʔ* or *\*moi/mwi* of uncertain etymology. The misleading similarity of the Bantu roots *mòì, mòdì, mòtí* may be due to the merger of the noun class prefix **\*mʊ̀-** with the nominal base.<sup>1</sup> This hypothesis (developed in detail in Vanhoudt 1994) has now found its way into the BLR (cf. BLR3 *sub mòdì* (NC): '*plutôt mʊ̀-òdì: voir Vanhoudt 1994* ').

Among other common Bantu forms are *mócà* (zones KN), *mòtí* (ABCEGHKLRS) <*\*mʊ̀-òtì*, *mʊ́égá* (zones BH) (BLR3: *mòì* + suffix), and *mòì* (ABCDEFGJKLMRS). As will be shown below, the presence of a nasal prefix in the Bantoid numerals is suggested by the distribution of these forms in Benue-Congo. Those BC branches that have nasalless roots within the nominal classes 'one' and 'three' lack the terms for 'one' with a nasal consonant.

This interpretation, however, does not address two major issues, namely 1) whether the forms in question (e.g. *\* -òdì/ -oti/ -oʔi*<sup>2</sup> ) consist of one or more roots and 2) whether the open back vowel belongs to the root.

A solution to the former problem may turn out to depend on how the latter is treated.

Within the context of Niger-Congo, it is conceivable that the Proto-Bantu *òdì* may go back to \**ò-dì*, with **\*ò-** being a marker of the NC noun class 1 (**\*ko-/ ʔo** according to my reconstruction). This hypothesis will receive a more detailed treatment in the next chapter. At this point, we will only note that it is quite problematic to explain the common reflexes of **\*-di, \*ti,** and **\*ʔ-** in Bantu within this hypothesis. Moreover, the etymological relationship between these roots (disregarding *\*di* and *mɔ(m)* (Tivoid), *ó-mè* (Mbe), *ma* (Mamfe), etc.) would be much less transparent than in the case of *modi ~ moti* or even *-odi ~ -oti*.

<sup>1</sup> I agree with Larry Hyman who reacted to this point: "This would suggest that '1' was a noun; possible, just like '10', but note that '2'–'5' are not nouns!" (p.c.).

<sup>2</sup> Larry Hyman: "The glottal stop goes back to a velar in Grassfields; it could be either alveolar or velar in Tikar" (p.c.).



Table 4.2: Bantoid stems for '1'

*<sup>a</sup>*The Fam and Tiba (Fà) forms are quoted according to Blench (n.d.[b]) and Boyd (1999) respectively. The online version of Boyd (https://hal.archives-ouvertes.fr/hal-00323718v3) differs from the printed one.

*<sup>b</sup>*An asterisk (\*) in the second column of the tables (here and below) means that in the corresponding line all the forms are reconstructed. However, with the exception of the Proto-Bantu line, which indicates real reconstructions in BLR3 (\*), all other reconstructions are hypothetical (#) and reflect the most typical form/forms attested in a particular branch of Benue-Congo. Forms that may be related are grouped in tables within the columns. The last column of the tables shows isolated forms that are likely to be innovations.

*<sup>c</sup>*Concerning the form *yet* in Ekoid, I quote a precious remark of John Watters (p.c.): "The actual root for Proto-Ekoid may be **-t ~-d**. The /aŋ/ in some Ekoid languages may be an accretion. The *yét* morphologically is /yé-t/ with the CV being a class agreement prefix, and **-t** being the root. So the **-t** may be closer to the Bantu *moti*. I'm not sure how *ó-mè* in Mbe figures in with the rest of Ekoid, but one possibility is that the **-mè** root derives from /me-t/. Ekoid needs further work".


The secondary PB form *\*ókó* (zones ABCHF) (BLR3: "Janssens 1994: alternance C1 p/m/b-ókó- protoforme secondaire, cf. 'seul'") is comparable to *\*baka* (Beboid: Fio *mbákâ ~ nbáhá*, Nchane (Mungong) *m⁴ba³ka⁴*). It should be noted that the above considerations allow us to explain the initial consonant (and the following back vowel) in these forms as noun class morphemes, too.

The Northern Bantoid *kin/cin* is remarkable and will be addressed later in this chapter.

The Bamileke *\*tʃu* (Fefe *ʃɯʔ*, Medumba *antʃʊʔ*, Nda'nda' *ŋtʃɔ̀ʔ*, etc.) is possibly related to the Bantu *\*tʊ* (BCDEGLP) 'alone, empty, vain'.

#### **4.1.1.2 'Two' and 'Three'**

Without exception, the reconstructed root for 'two' in all Bantoid branches has an initial labial consonant, either voiced (b-) or voiceless (p-/f-). A more precise reconstruction of the proto-form is beyond my competence. The forms cited above do not permit a conclusion with regard to the number of roots involved (one or two). When comparing the most commonly attested forms *\*pa/fe* and *\*baa*, it is necessary to keep in mind that at least the Proto-Bantu *\*bàdɩ́/bɩ̀dɩ́* could be a reflex of *\*di*. In that case, **ba-** in the proto-form should be interpreted as a prefix of a plural noun class (possibly class 2).<sup>3</sup> The latter proposal finds support in the dialectal Proto-Bantu form *jòdè* (zones BH) (<*\*jò-dè*?). The main forms show the following zonal distribution: *bàdɩ́* (ABCHKLR), *bɩ̀dɩ́* (CDEFGJKLMNPS), *bɩ́dɩ̀* (?).

It was repeatedly stressed that the root for 'three' (\**tat*) is one of the most stable in NC and in the Bantoid languages in particular. Phonetic variation within this root will be studied in Chapter 5.

#### **4.1.1.3 'Four' and 'Five'**

The well-known NC root *\*nai* 'four' is represented in all of the pertinent languages. The only exception is Grassfields, where it was replaced with the innovative *\*kwa/kya*. According to Roger Blench, Momo *-kpi* and Ring *kaìkò* as well as the Proto-Eastern Grassfields *\*-kùa* go back to the Proto-Benue-Congo *#-kpà(ko)* (Blench 2004: #387). This root, however, is commonly found in Mbam-Nkam, i.e. in all Grassfields languages, and is barely attested outside this branch.

<sup>3</sup> John Watters: "This analysis, if correct, could work also for most of Bantoid. So Ekoid would derive from **ba**- prefix and **-l ~ -d ~ -n** root. However, the /b/ may derive from /p/. Ekoid may derive from *\*-pal* and then you have the many other Bantoid languages with /p/" (p.c.).


Table 4.3: Bantoid stems for '2' and '3'

The root for 'five' is almost invariably *\*tan*. One possible exception is the Ekoid form, unless *\*don/ron/lon* (Ekajuk *nlɔn*, Ejagham *érôn*, Nkem-Nkum *írôṉ*) is a reflex of *\*tan*.

It should be noted that the Ndemli root *itʃìjè* may be related to *kwV* in the Grassfields languages. As we hope to demonstrate below, this is probably not a coincidence.



#### **4.1.1.4 'Six'**

The Grassfields languages show a common root *\*toʔo*. Outside Grassfields, it is attested only in Ndemli (just like the Grassfields root for 'five') and thus can hardly be reconstructed for Proto-Bantoid. However, we cannot exclude this if PB \**tʊ́ʊ́bá* '6', attested in zones ABCD, is related to the Grassfields forms.

<sup>4</sup> John Watters: the Proto-Ekoid probably is \*-ron (p.c.).


Table 4.5: Bantoid stems and patterns for '6'

As in some other NC branches, three patterns that can be used to derive '6' from '3' are attested in the Bantoid languages (the following observations are even more relevant in the case of the patterns for 'eight' based on 'four'):

1. The change of a class prefix (or its addition): Ajumbu *tò* '3' > *kʲà-tò* '6'; this pattern is possibly attested in Tutomb (Mbam) *pɛ́-dààt* '3' > *pí-tʃín-dìt* '6', Elip *bʊ́-dád̥* '3' > *bʊ́-thín-dàd̥* '6' (this pattern is marked '3PL' in the table above). To strengthen the etymology for 'six' in Tutomb, it should be noted that in Tunen (another Mbam language), which has *\*tat* '3' > *lal* (*bɛ́-lálɔ́*), the term for 'six' also contains [l]: *pɛ́-lɛ́ⁿdálɔ*.


The Kenyang (Mamfe) form *bɛ́-tándât* '6' (cf. *bɛ́-rát* '3') deserves special discussion. This form is reminiscent of the common Bantu form *tándà* '6' attested in zones DGM. Its extended variant *tándàtʊ́* is found in EFGJS, while the GNS zones use the form *tántàtʊ́*, which is even more interesting. Are the Bantu *tándà* forms cited above based on '3'? If so, *\*tat-tat* > *tatat* (*tántàtʊ́*) in the languages to which Dahl's law is applicable as well (> *tandat, tanda*).

In this case, the form *tʊ́ʊ́bá* (zones ABCD) that can be interpreted as '\*3\*2': \**tat-X-ba* may also be a derivative form.

If so, the aforementioned Bantu forms (as well as the Kenyang form) are probably not innovations. They may reflect a Proto-Bantoid model where 'six' is based on 'three'. It should be noted that a close parallel to the Kenyang form is attested in the Mbam branch: Nomaande *be-tíndétú* '6'.

In sum, it appears that the most probable word-formation pattern for 'six' in Proto-Bantoid is '3+3' or '3PL'.

#### **4.1.1.5 'Seven'**

The case of 'seven' seems pretty straightforward. In the majority of the Bantoid branches (including Bantu) the root is *\*samba/camba*. However, there is still a question whether this root is indeed primary: its Bantu reflex is strikingly similar to the root for 'six'. Table 4.7 shows some selected examples.

It is noteworthy that the terms for 'six' and 'seven' show similarity not only in the case of the root in question, but in the case of other roots as well, e.g. J50: Fuliiru *lindátù* '6' ~ *-linda* '7', Shi *ńdarhu* '6' ~ *ńda* '7'. This similarity is usually conditioned by one of the following factors:

• the terms for 'six' and 'seven' follow the patterns '10–4' and '10–3' respectively: Yeyi (Bantu R40) *vùndʒà ɛ́nɛ́ɛ́* '6' ('10' 'break' '4 (fingers)'), *vùndʒà ɛ́táâːtō* '7' ('10' 'break' '3 (fingers)'). This, however, is very rarely attested.



Table 4.7: Similarities between '6' and '7' in Bantu






Table 4.9: '6' and '7' from '5' in Bantu


Staying within the Bantoid family, it is difficult to say which of these explanations should be applied in the present case. If it is alignment by analogy, we should reconstruct a Proto-Bantoid primary root \**samba/camba* for 'seven' and then explain the many irregular shifts in the forms of 'six' (e.g. t > s) by analogy with this root (as shown above, the Proto-Bantu 'six' is based on 'three' (\*tat)). We may also be dealing with a derived proto-form *\*sam-ba/cam-ba* with the second element probably going back to 'two'.

#### **4.1.1.6 'Eight'**

Both Grassfields and Ndemli share the common primary root for 'eight' (*\*famV*). We have already seen this distribution, which only suggests that Ndemli belongs to the Grassfields branch (at least on the basis of their numeral systems). The majority of other branches point to the reconstruction of the term for 'eight' as based on 'four' (either by means of reduplication or by the noun class switch, or both).

Table 4.10: Bantoid stems and patterns for '8'

#### **4.1.1.7 'Nine'**


Table 4.11: Bantoid stems and patterns for '9'


It seems likely that there was a primary root for 'nine' in Proto-Bantoid. It can be tentatively reconstructed as *\*bukV*.<sup>5</sup> In Bantu, this root is found in the ABCDHL zones. The most common pattern '5+4' (as well as the less frequently attested '10–1') often develops independently in various languages. A marginal pattern '8+1', attested in Mamfe, Mbam and Tivoid, is noteworthy. Because of its rarity, it is relevant for the genetic classification of the Bantu languages, since it is hard to imagine that this form developed independently in each of these branches. The last column of the table below lists bases that are exclusively found in a specific Bantoid branch.
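The pattern notation used in this chapter ('5+4', '10–1', '8+1', and elsewhere '10\*2' or '20\*5') is plain arithmetic over smaller numerals. As a minimal sketch, assuming a hypothetical helper `pattern_value` (the name and the choice of Python are illustrative and not part of the original analysis), the value expressed by each pattern can be checked as follows:

```python
from operator import add, mul, sub

# Operators used in the pattern notation; the en dash (U+2013) is
# treated as subtraction, as in '10–1'.
OPERATORS = {"+": add, "-": sub, "\u2013": sub, "*": mul}


def pattern_value(pattern: str) -> int:
    """Return the number expressed by a two-term pattern such as '5+4'."""
    for symbol, operation in OPERATORS.items():
        left, found, right = pattern.partition(symbol)
        if found:
            return operation(int(left), int(right))
    raise ValueError(f"no known operator in pattern: {pattern!r}")


# Patterns cited in the text for 'nine', 'twenty' and 'hundred'.
for pattern in ("5+4", "10\u20131", "8+1", "10*2", "20*5"):
    print(pattern, "=", pattern_value(pattern))
```

The sketch only verifies the arithmetic of a pattern; it says nothing about which pattern is inherited and which is an innovation.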

#### **4.1.1.8 'Ten'**

At least two Bantoid roots (*\*fu* and *\*kum/ kam*) may be useful for our reconstruction purposes. Both of them are attested in no fewer than six of the Bantoid branches (note also the Chamba-Daka *kúūm* 'nine'). The Mambiloid languages show the greatest variety of roots.

It should be noted that a separate Proto-Bantoid form for 'ten' is not traceable in some of the pertinent languages. Despite this, it has been preserved as a part of the term for 'twenty', e.g. 'ten' is attested as *é-pɔ́ːt* in Ipulo (Tivoid). This form is probably related to Tiv *púè/ púwè* and Lyive *epùɛ̀* and may be attested in the Mbam branch as well (Nubaca *mwa-pwat* 'ten', etc.).

It is clear, however, that the Ipulo 'twenty' (*i-ham*) is derived from the Proto-Bantoid term for 'ten' by means of a noun class switch. The same can be applied to Bhele (D30): *mɔkɔ́* 'ten' but *e-kómi í-ɓalé* '20' (*í-ɓalé* 'two'). The root *kam* will be discussed below in connection to the terms for 'hundred'.

<sup>5</sup> John Watters: "Given the distribution of these forms for 'nine' I would conclude that Proto-Bantoid likely used 5+4 and that *\*bukV* was an innovation in the pre-Bantu era when Proto-Bantu had not yet separated from what became Grassfields and other closely located Bantoid groups".



Table 4.12: Bantoid stems for '10'


#### **4.1.1.9 'Twenty'**

It is not necessary to quote the forms for 'twenty', since in the majority of the Bantoid branches (including Bantu) this term is based on 'ten' and follows the pattern '10\*2'. Some minor but peculiar variations should be noted here, but all of them are of little significance for our reconstruction. E.g. the term for 'twenty' often employs the plural noun class with the two components in agreement. However, non-compound forms based on 'ten' or 'two' in the plural are also attested. For instance, in one of the Bafut dialects *báà* 'two', *tà-wûm / nɨ̀-wûm* 'ten' > *mɨ̀-wúm mí-mbáà* 'twenty', while *tà-ɡhûm* 'ten' ~ *mɨ̀-ɡhum* 'twenty' in another. At the same time, Limbum *báː* 'two' ~ *ṁ-báː* 'twenty'. These patterns (especially the former) are common in the majority of the Bantu languages as well.

Primary roots for 'twenty' are rarely attested. They may go back to the lexical base 'man' (e.g. in D30 Komo *nkpá búi* 'twenty' = 'whole person'), 'head' (Suga (Mambiloid) *ɓʉʉ bíb* 'twenty' < *ɓʉʉ* 'head') or some other lexical bases (e.g. Bantu A50: Bafia *ɨ̀-tín* / *mʌ̀-tín* 'twenty' < 'score').<sup>6</sup>

#### **4.1.1.10 'Hundred' and 'thousand'**

It appears that the term for 'hundred' cannot be reconstructed for Proto-Bantoid: in most of the branches the pattern employed is '20\*5',<sup>7</sup> whereas in some of the branches the term is borrowed. Both Grassfields and Bantu show innovations. The Grassfields root may be tentatively reconstructed as *\*ku*. Several roots are known for Bantu, their use being limited to certain zones: *kámá* ABCDHL, *gànà* DEFGJNPS, *tʊa* DL, *jànda* MNP. None of these roots is attested with this meaning in the Bantoid languages outside Bantu. The similarity of *kámá* to the root reconstructed for 'ten' is noteworthy. Moreover, it is attested with the meaning 'thousand' in at least three of the Bantoid branches, as Table 4.14 below shows.

The root *kam* allows multiple interpretations. We will return to it after the evidence from other Benue-Congo branches has been examined.

<sup>6</sup> John Watters: "The Bakor group of Ekoid attest something like *\*-tên* and Mbe has *-têl*. The other two Ekoid groups have a form *-rim* or *-sam*. I would reconstruct for Proto-Ekoid *\*-têl* or *\*-tên* which is like Bantu Bafia. They are a few hundred kilometers apart with many languages and a significant mountain range in between, so this is not borrowing" (p.c.).

<sup>7</sup> John Watters: "The distribution of this form is suggestive of an older vigesimal system for Bantoid rather than a decimal one. I would take the decimal ones as innovations" (p.c.).


Table 4.13: Bantoid stems for '100'


Table 4.14: Bantoid stems for '1000'


The Proto-Bantoid numeral system can be reconstructed as in Table 4.15.


Table 4.15: Proto-Bantoid numeral system<sup>8</sup>

According to Kay Williamson, the base for 'one' in Benue-Congo should be reconstructed as *#-kani*. The only form quoted in support of this hypothesis in her first article (Williamson 1989b: 255) is a supposed Bantoid reflex of the root in Tiba (*a-kina* '1'). Later (Williamson 1992: 396) she adduced one more Bantoid form, a Southern Bantoid Esimbi term *keni* '1'. That Williamson gives too much weight to these two marginal Bantoid forms is evident from the fact that she reconstructs this base not only for Benue-Congo, but for Niger-Congo as well. This leads her to the idea (probably expressed in the latter work for the first time) that Niger-Congo roots originally had a triconsonantal structure, hence her reconstruction of the proto-form for 'one' as *\*\*-'kə'gəni*. This Niger-Congo etymology will be studied in detail below. At this point we will only note that the Esimbi form cited above is strikingly unusual for the Bantoid languages and was probably misinterpreted. The form *kēnə̄* '1' is indeed attested in some of the Esimbi sources (see Brad Koenig, https://mpi-lingweb.shh.mpg.de/numeral/Esimbi.htm). However, in other sources the form *ɔ-nə* is attested (Cristin Kalinowski in (Chan)), so the term for 'eleven' is *bùɣù nə-nə* (*bùɣù* '10'). In other words, the base for 'one' in Esimbi is *-ni/-nə̄* (!), while the first syllable should be interpreted as the noun class prefix, just as in other numerals (cf. the forms *mə̄rākpə̄* '2', *mōɲī* '4', *mātə̄nə̀* '5', etc. in Koenig).

As for Tiba, it is still not certain whether this language indeed belongs to the Bantoid group (cf. Boyd 1999, where Tiba is considered an Adamawa language). The only Bantoid forms that could have been used by Williamson in support of her hypothesis are found in some of the Northern Mambiloid languages, cf. Twendi (Cambap) *tʃínī*, Mambila *tʃɛ́n* (with palatalization assumed). However, these forms are extremely marginal as well, so they cannot provide grounds for a proto-language reconstruction (in any case, not for Proto-Bantoid).

<sup>8</sup> My competence does not allow me to reconstruct the tones of numerals in the Bantoid languages, and especially in Benue-Congo.

### **4.1.2 Benue-Congo (the Bantoid languages excluded)**

After the numerals of the Bantoid languages, let's consider the numerals in each of the other groups within this vast family, namely Cross, Defoid, Edoid, Idomoid, Igboid, Jukunoid, Kainji, Platoid, Nupoid (Sections 4.1.2.1–4.1.2.9) and in some isolated BC languages – Ikaan, Akpes, Oko and Lufu (Sections 4.1.3.1–4.1.3.4). After this, we will generalize the results obtained in order to try to reconstruct the numerals of Proto-BC (§4.1.4).

#### **4.1.2.1 Cross**

Let us consider the typical stems for numerals in the Cross languages.


Table 4.16: Cross stems for '1'

Let us dwell on this table, using it as an example for understanding the majority of the subsequent tables given in this book. Almost every table represents the synthesis of the primary data. We cannot publish all of these primary forms, but here we make an exception. In order to make clear to the reader on what basis the generalizations were made, we present in Appendix D all the forms available for the numeral '1' in the Cross languages, including intermediate Proto-Upper Cross and Proto-Lower Cross reconstructions, proposed by Dimmendaal (1978) and Connell (1991). From Appendix D, it is clear that Connell accepts Dimmendaal's hypothesis, according to which in Upper Cross *\*gʷá-* is interpreted as a prefix, and the lexical stem is represented by *\*-ni*, attested also in Central Delta-Cross and Ogoni. Based on the 60 sources listed in Appendix D, the root *ni(n)* is identified for the numeral '1' in Table 4.16. The table also identifies a second root for '1', also possibly represented in three of the five branches. Connell reconstructs it as *\*cèèd*, but the data from various Lower Delta-Cross languages, as well as from Bendi, suggest that perhaps we are dealing with a palatalization of the velar before the front vowel: *\*ked / ket / kin* > *ced / cin* (unfortunately, for most groups of Niger-Congo, including Cross, we do not have sufficient grounds for reconstructing the tones). Finally, the third root, attested in Icheve *à-mɔɔ*, is probably related to Bantu.

<sup>9</sup> Here and below, index D introduces the reconstruction proposed by Dimmendaal (1978).
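The palatalization suggested for the second Cross root for '1' (*\*ked / ket / kin* > *ced / cin*) can be stated as a single conditioned rule. The following is only an illustrative sketch: the function name `palatalize`, the choice of *i* and *e* as the set of front vowels, and the control string `kam` are assumptions made for the example, and tones are ignored, as in the text.

```python
import re

FRONT_VOWELS = "ie"  # assumed inventory of front vowels for this sketch


def palatalize(form: str) -> str:
    """Rewrite k as c when a front vowel follows; leave other forms unchanged."""
    return re.sub(f"k(?=[{FRONT_VOWELS}])", "c", form)


# The three velar-initial variants from the text, plus a made-up control
# form with a non-front vowel, which the rule should not touch.
for proto in ("ked", "ket", "kin", "kam"):
    print(proto, ">", palatalize(proto))
```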

**'Two' (Table 4.17)**


The roots *\*bae* and *\*po/pa* are noteworthy.


**'Three' and 'Four' (Table 4.18)** The common Niger-Congo roots are attested for these numerals in all of the branches (\**ta(t)/ ca(t)* and *\*na(n)* respectively).


Table 4.18: Cross stems for '3' and '4'

**'Five' (Table 4.19)** Two roots can be postulated for Cross, namely *\*tan* and its alternative, tentatively described as *\*gbo(k).*




**'Six' to 'Nine' (Table 4.20)** At this stage it seems reasonable to maintain the forms and patterns represented in the last line of the table.


Table 4.20: Cross stems and patterns for '6'-'9'

**'Ten', 'Twenty', and 'Hundred' (Table 4.21)** It should be noted that providing a detailed reconstruction for each of the Cross numerals lies beyond the scope of the present investigation, so there is probably no point in trying to establish which of the roots for 'ten' (*\*kpo* or *\*job*) should be reconstructed for Proto-Cross (this is in any case impossible without external evidence).

The Cross languages are highly divergent in regard to numerals (an exception should be made for 'three' and 'four' which are remarkably stable in Cross, as well as in the other NC branches). However, the forms cited above do not provide sufficient reason to suggest a closer relationship within any randomly selected pair of the Cross branches. Hence, it would be too daring to interpret the roots attested in both of these branches as shared innovations. Let us count the numbers of related numeral forms in different pairs of the Cross branches (Table 4.22).

This distribution is remarkable for the total absence of shared forms (with 'three' and 'four' excluded) between Bendi and Central Cross. Keeping this in mind, all of the established alternative roots and patterns can be reserved for a later discussion. At this point the following reconstruction of the Proto-Cross numerals can be suggested (Table 4.23).


Table 4.21: Cross stems and patterns for '10', '20' and '100'

Table 4.22: Number of related numerals in different pairs of the Cross branches
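The counting procedure behind a table of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `shared_counts` is hypothetical, each branch is represented as a set of (numeral, root label) pairs, and the toy data below are placeholders invented for the example; they are not the Cross data.

```python
from itertools import combinations


def shared_counts(roots_by_branch):
    """Count the (numeral, root) pairs shared by every pair of branches."""
    return {
        (first, second): len(roots_by_branch[first] & roots_by_branch[second])
        for first, second in combinations(sorted(roots_by_branch), 2)
    }


# Placeholder data: the branch names and root labels are NOT attested forms.
toy_data = {
    "Branch A": {("1", "root-x"), ("2", "root-y"), ("10", "root-z")},
    "Branch B": {("1", "root-x"), ("10", "root-w")},
    "Branch C": {("2", "root-y"), ("10", "root-z")},
}

for pair, count in shared_counts(toy_data).items():
    print(pair, count)
```

Pairing each root with its numeral keeps accidental look-alikes in different numerals from being counted as shared forms.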





#### **4.1.2.2 Defoid**

The Defoid branch is relatively compact: it is composed of four languages including Yoruba and its dialects. Historical phonetics of these languages should be considered for a proper reconstruction of the Defoid numeral system, because most of the terms show great phonetic variety. E.g. for 'four' several forms are attested: *-nɛ* (Ariɡidi), *-jē̃* (Ayere), *-rin/-hɛ̃/-ɛ̃* (Yoruba), *-lɛ̀* (Igala). The main forms are given in Table 4.24, and their reconstruction will be discussed below.


Table 4.24: Defoid numerals

Following the Proto-Yoruba-Igala reconstruction (Pozdniakov, ms), the terms *\*lɛ(n)* '4', *\*lú(n)* '5' and *\*sá(n)* '9' are reconstructed on the basis of the following regular phonetic correspondences (Table 4.25).

The following examples illustrate the phonetic correspondences deriving from \*l (Table 4.26).


Table 4.25: Fragment of the Yoruba-Igala phonetic reconstruction

Table 4.26: \*L-stems in Proto-Yoruba-Igala and their regular reflexes



Yoruba [s] corresponds to Igala [r] (<\*ʃ) or [l] (<\*s) in at least six examples; see Table 4.27 below.


Table 4.27: Reflexes of \*ʃ and \*s in Yoruba-Igala

The reconstruction of the term for 'seven' (*\*byē*) is based on the following correspondences (Table 4.28).

Table 4.28: One more fragment of the Yoruba-Igala regular correspondences


The reflexes of **\*by-** can be represented as follows (Table 4.29).

Table 4.29: Reflexes of \*by in Yoruba-Igala



Finally, the terms *\*gwá* '10' and *\*gwú(n)* '20' are reconstructed in view of **\*gw** > Yoruba **w** (before [a])/**g** (before [u]) ~ Igala **gw** (Table 4.30).


Table 4.30: Reflexes of \*gw in Yoruba-Igala

These correspondences are treated here in detail because they may be of special interest for the comparative study of the Defoid languages.
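The conditioned reflexes of **\*gw** stated above can be written out as a small rule. The sketch below is an illustration only: the function names `yoruba_reflex` and `igala_reflex` are hypothetical, the outputs are schematic stems produced by the rule rather than attested citation forms, and contexts other than [a] and [u] are left unchanged because the text does not describe them.

```python
import unicodedata


def base_vowel(char: str) -> str:
    """Strip tone and nasalization marks from a single character."""
    return unicodedata.normalize("NFD", char)[0]


def yoruba_reflex(proto: str) -> str:
    """Apply *gw > w before [a] and *gw > g before [u]."""
    form = proto.lstrip("*")
    if not form.startswith("gw") or len(form) < 3:
        return form
    vowel = base_vowel(form[2])
    if vowel == "a":
        return "w" + form[2:]
    if vowel == "u":
        return "g" + form[2:]
    return form  # other contexts are not covered by the stated rule


def igala_reflex(proto: str) -> str:
    """Igala retains gw, so only the asterisk is removed."""
    return proto.lstrip("*")


for proto in ("*gw\u00e1", "*gw\u00fa"):  # *gwá '10', *gwú(n) '20' without (n)
    print(proto, "> Yoruba", yoruba_reflex(proto), "~ Igala", igala_reflex(proto))
```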

#### **4.1.2.3 Edoid**

The following reconstruction is based on nearly forty sources which represent twenty languages within this group. The reconstruction proposed by Elugbe was also considered.

Being no specialist in the comparative study of the Edoid languages (unlike Elugbe), I do not feel competent enough to criticize his ideas. Elugbe likely had his reasons for reconstructing the same consonant (**\*ch**-) in the terms for 'three', 'five', 'six', and 'seven'. Indeed, the comparison of data from the four Edoid branches confirms that the terms for 'three' and 'five' (but not for 'seven') have the same initial consonant. This is common for many of the NC branches (and probably for the Proto-NC as well).

In view of this, I would like to suggest a simplified reconstruction that is closer, in my opinion, to the actually attested forms (Table 4.31).



Table 4.31: Edoid numeral systems and Proto-Edoid

#### **4.1.2.4 Idomoid**

The roots attested in about ten of the Idomoid languages are represented in Table 4.32.



*<sup>a</sup>*Please note that hypothetically related forms are separated by a slash (/), whereas unrelated ones are separated by a comma.


It should be noted that the data on the Yatye-Akpa branch (one of the two Idomoid branches) is systematically absent. The analysis is based on the Akweya languages only, so unexpected issues may arise.

#### **4.1.2.5 Igboid**

This is a small group consisting of several languages. The forms which could be found in modern Igboid languages are listed in Table 4.33.


Table 4.33: Igboid numerals

Interestingly, the terms for 'one' attested in the Igboid languages (as found in Koelle 1963[1854]) are subject to significant variation. The following forms are noteworthy: '1' – Īsóāma *oo-te*, Íṣiēle *mfuu*, Ábādṣa *na*, Aro *mbɔ*, Mbṓfīa *mpoŋ* (the transcription of the forms and languages follows Koelle). The rest of the numerals quoted by Koelle are essentially the same as the ones found in Table 4.33.


#### **4.1.2.6 Jukunoid**


Table 4.34: Jukunoid numerals

Tentative reconstructions for the three major branches of this relatively small family are presented in the table above. The terms for 'one' and 'ten' vary significantly.

#### **4.1.2.7 Kainji**

The comparative analysis of the Kainji group is hindered by the fact that there is no linguistic description for the majority of its languages. However, there is a great range in numerical terms within those languages, for which reliable data is available. The following analysis is based on thirty pertinent sources, including the comparative list of forms compiled by Dettweiler & Dettweiler (1993). What follows is a step-by-step analysis of the available data that will hopefully yield some answers.

#### 4.1.2.7.1 'One'


Table 4.35: Kainji stems for '1'

The grouping principles for the forms included in this table are admittedly haphazard. On the one hand, the relationship between some of the forms arranged into the same column (e.g. *hĩn, tʃɘ̄ːn* and *dɛn* or *dínkā* and *\*lu-ruŋ*) is not immediately apparent. On the other hand, some of the forms placed in separate columns might be etymologically related (e.g. *dɨɨn* Giro and *dínkā* Iguta). In these circumstances it seems reasonable to go back to the reconstruction of the Kainji term for 'one' on the basis of the data provided by other Benue-Congo branches (see §4.1.4).

#### 4.1.2.7.2 'Two'

The above considerations regarding the term for 'one' are applicable to the term for 'two' as well. The inventory of forms found in Table 4.36 is neither helpful for the reconstruction of the Proto-Kainji term for 'two', nor suggestive of the morphemic analysis of the pertinent forms within each of the branches. As we hope to demonstrate below, additional information that may prove useful for the reconstruction of the term for 'two' can be obtained through the analysis of the term for 'seven'.

Table 4.36: Kainji stems for '2'

#### 4.1.2.7.3 'Three', 'Four' and 'Five'


Table 4.37: Kainji stems for '3'-'5'


Unlike the terms for 'one' and 'two', the numerals covering the sequence from 'three' to 'five' are quite homogeneous and thus can be reliably reconstructed (just as in the majority of other NC branches). The provisional forms suggested for 'three', 'four', and 'five' are *\*tat*, *\*nas,* and *\*tan* respectively. The latter form can also be reconstructed for Eastern Kainji on the basis of the Amo evidence. Thus *ʧibi* (*ʧi-bi*?) 'five' is an innovation of the Jera subgroup.

#### 4.1.2.7.4 'Six' and 'Seven'


Table 4.38: Kainji stems and patterns for '6'-'7'

Some of the previously discussed terms for 'one', 'two' and 'five' are quoted in the table above alongside the terms for 'six' and 'seven'. Such grouping might facilitate a better understanding of compound numerals (if 'six' and 'seven' are indeed compounds) as well as the methodological and theoretical aspects behind their reconstruction. In addition, it might help to establish whether parts of compound numerals can be used to enhance the reconstruction of the primary numerical terms such as 'one', 'two', and 'five'.

The compound nature of the term for 'seven' is betrayed by its 'length': the forms quoted in the table normally have two to three syllables, whereas the primary numerals are as a rule mono- or (rarely) bisyllabic.

At the same time, in some of the cases the pattern '7=5+2' is immediately apparent (cf. languages 7–11, 13–15).

At this point, however, we will deal with those languages that show only faint (or no) traces of the pattern in question ('7=5+2'). E.g. in Tsishingini (16) we have to assume the pattern '7=X+2', where 'X' is an unknown element, whereas in language 12 the pattern is '7=5+X' (the relationship between 'X' and the term for 'two' is questionable).

Let us assume that the Proto-Kainji terms for 'two' and 'five' are \*CL-**re** (cf. e.g. Duka *\*jo-re* > *joor*) and \**tan* respectively. In this case, the compound term for 'seven' would be *\*tan-(CL)-re* or *\*tan-X (connector)-(CL)-re*. The most typical diachronic scenarios for the emergence of the 'X'-patterns effective on the synchronic level are as follows:



Any of these model processes may result in the loss of phonetic resemblance between a derived form and its source. This may lead to a situation where a derivation pattern is no longer recognizable by speakers. As a consequence, the term for 'seven' becomes opaque on the synchronic level and can no longer be analysed as '5+2'.

This means that the replacement of the original term for 'two' by an innovation does not affect the compound term for 'seven', i.e. that its second part is not automatically replaced. Moreover, in case there is sufficient evidence that the second of the aforementioned scenarios was applied, we may enhance the reconstruction of the primary term for 'two' on the basis of the compound term for 'seven'. E.g. the form *tʃéndʒe* suggests that the original Basa root for 'two' was *\*dʒe / re* and not *\*bi* as in the majority of the Kainji languages.

The available pertinent forms point toward the reconstruction of the Proto-Kainji form as *\*tan-da-re* ('5'-connector-'2'). The reconstructed forms for 'two' (marked with [\*] in Table 4.38) suggest a Proto-Kainji form *\*re* '2' and the pattern \*'7=5+2'. The Eastern Kainji forms for 'seven' are probably innovations.

However, some of the forms attested for 'seven' may point toward the reconstruction of 'two' as *\*ba/bi* in Proto-Kainji. In this case our reference list should be expanded by adding dialects that were not included for reasons of space: it is not possible to quote every single NC source every time. E.g. Cawai (Eastern Kainji) *a-ba* '2', *a-tar-ba* '7', Ngwoi (Hungworo) *e-bia* '2', *sa-bia* '7' (the root *\*ba/ bi* is also suggested by Eastern: Gure *pi-ba*, Gyem *ve*, Piti *ba*, Surubu *ka-va*).
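The pattern labels used in this section ('7=5+2', '7=X+2', '7=5+X') can be illustrated with a minimal Python sketch. It assumes that the forms are already segmented by hyphens and that a stem counts as present only when it matches a segment exactly. Both the function name `classify_seven` and this naive orthographic matching are illustrative assumptions, not a substitute for the comparative analysis itself. The forms are those quoted above; for Cawai and Ngwoi the term for 'five' is not quoted here, so only the '2' component can be checked.

```python
def classify_seven(seven, five=None, two=None):
    """Label a hyphen-segmented form for 'seven' by the known stems it contains.

    `five` and `two` are the stems for 'five' and 'two' in the same language
    (None if unknown). Matching is purely orthographic: a hypothetical
    simplification that ignores sound change.
    """
    segments = seven.split("-")
    has_five = five is not None and five in segments
    has_two = two is not None and two in segments
    if has_five and has_two:
        return "7=5+2"
    if has_two:
        return "7=X+2"
    if has_five:
        return "7=5+X"
    return "opaque"


# Reconstructed Proto-Kainji *tan-da-re ('5'-connector-'2')
print(classify_seven("tan-da-re", five="tan", two="re"))  # 7=5+2
# Cawai a-tar-ba '7' beside a-ba '2' ('five' not quoted in the text)
print(classify_seven("a-tar-ba", two="ba"))               # 7=X+2
# Ngwoi sa-bia '7' beside e-bia '2' ('five' not quoted in the text)
print(classify_seven("sa-bia", two="bia"))                # 7=X+2
```

An unsegmented form such as Basa *tʃéndʒe* comes out as "opaque" under this exact-match criterion, which is why such cases call for etymological rather than mechanical analysis.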

The forms for 'six' are more problematic since they may go back to a primary root (or roots). They may be tentatively reconstructed as *\*ci(hi)n, \*tas,* and *\*tel*. We will come back to these forms in order to enhance their reconstruction in case similar forms are detected in other BC branches.

#### 4.1.2.7.5 'Eight'

The Eastern Kainji and Duka forms (if related) suggest that the primary root *\*-ru* should be reconstructed for 'eight' in Proto-Kainji. At this point, let us reserve a preliminary form \**u-ro/ ji-ru* for further comparison. In most of the Kamuku languages the pattern '8=5+3' is traceable (but note the Western Acipa form that is comparable to those attested in Kambari and possibly Amo (Eastern)). This points towards an alternative form of uncertain morphological structure (*\*kunle(v)/ kunlo* '8').




#### 4.1.2.7.6 'Nine' and 'Ten'

There are several forms and patterns for 'nine' whose reconstruction is equally plausible: '9=5+4', \**tor(b)oj* (possibly <\*'10–1'), *\*jiro*. Each of the forms/patterns is characteristic of a particular sub-group of languages. The term for 'ten' is reconstructed as *\*pwa*, with its reflexes attested in all Western Kainji branches. Three alternative forms (*\*turu*, *\*kuri, \*kup/ kpa*) are found in Eastern Kainji, where they are employed for counting and in quantity measures.

#### 4.1.2.7.7 'Twenty' and 'Hundred'

The diversity of patterns for 'hundred' may indicate the absence of the term in Proto-Kainji. The term for 'twenty' likely followed the pattern '20=10\*2'. However, the form *\*ʃín/ ʃík* attested in three of the Western Kainji branches is noteworthy.

#### 4.1.2.7.8 Summary

It should be noted that a full reconstruction of the Kainji numeral system is not presently achievable for a number of reasons: some of the forms have multiple alternative variants, many terms are not attested outside Kainji (or have an obscure morphological structure), the elements of the compound terms are not always identifiable (e.g. in the patterns '7=X+2' or '7=5+X'), etc.

The numerals attested within this group are so peculiar (at least for a nonspecialist in the Kainji languages like myself) that one may wonder whether the Kainji group should indeed be treated as a branch of Benue-Congo. In any case, it seems reasonable to record all the forms reconstructable within the Kainji subgroups. These forms and patterns are represented in the table below (Table 4.40).

Table 4.40: Kainji summarized data for BC reconstruction



Table 4.41: Kainji stems and patterns for '9' and '10'


Table 4.42: Kainji stems and patterns for '20' and '100'

4.1 Benue-Congo

#### **4.1.2.8 Platoid**

#### 4.1.2.8.1 **'One'** (Table 4.43)

The grouping of roots here is admittedly provisional, because their morphological structure is often obscure. In addition, the phonetic changes that may have taken place are unknown. It is very difficult to propose any etymological interpretation for the forms represented in the table. Which of them could be attributed to Proto-Platoid is unclear (\*(*y*)*in* is a possibility, if noun class markers are indeed incorporated into the numerical terms).


Table 4.43: Platoid stems for '1'

Tesu data are taken from Blench & Kato 2012.

#### 4.1.2.8.2 **'Two', 'Three' and 'Four'** (Table 4.44)

The roots for 'two' containing voiced and voiceless labials are attested in the Platoid languages (as well as in some other BC branches). They may be tentatively reconstructed as *\*pa/ fa/ ha* and *\*ba/ wa*.


Table 4.44: Platoid stems for '2', '3' and '4'

The roots for 'three' and 'four' are more stable. Some of their reflexes suggest that the Proto-Platoid forms must have been close to the NC forms: *\*tat* '3' and \**nai / \*nas* '4'.


#### 4.1.2.8.3 **'Five' and 'Six'** (Table 4.45)


Table 4.45: Platoid stems and patterns for '5' and '6'

The term for 'five' is reconstructed as *\*tu(ku)n*. It is likely that there was no primary term for 'six' in the Proto-Platoid group: in all pertinent languages (except for Eggon, Hasha and Sambe) the term in question either follows the pattern '5+1' or is built by adding a plural class to the term for 'three'.

#### 4.1.2.8.4 'Seven' and 'eight' (Table 4.46)

Word-building patterns for the term for 'seven' are normally quite transparent: '7=5+2' is attested in the majority of the sub-groups, whereas '7=4+3' is rarer. The same applies to the term for 'eight', which either follows the pattern '8=5+3' or is built by partial reduplication of 'four' (4 redupl.). Sometimes the archaic primary terms for 'two' and 'five' are traceable in the forms for 'seven' and 'eight' (such forms are marked with an asterisk in the respective tables).


Table 4.46: Platoid stems and patterns for '7' and '8'


#### 4.1.2.8.5 **'Nine' and 'Ten'** (Table 4.47)

It is likely that the term for 'nine' attested in Ikulu, Yeskwa and Sambe (*toro/cora*) is primary. The hypothetical inter-relationship of these roots may be of interest for the Proto-Platoid reconstruction, because these languages do not belong to the same sub-group. The forms of 'nine' in the majority of the languages show traces of 'five', 'four', 'ten' and 'one', which suggests that two alternative patterns ('9=5+4' or '9=10–1') could have been in use. Some rare patterns (e.g. '9=12–3' (Birom) and '9=8+X' (Tesu)) are of interest for linguistic typology.

According to Bouquiaux (1962) the term for 'twelve' (*kūrū*) is attested in Birom. In this language '21' (*kūrū ná syāː-tāt*) = '12+9' (*syāː-tāt*), while '80' (*bākūrū bātīː mìn ná rwīːt*) = '12\*6' (*-tīː mìn*) + '8' (*-rwīːt*). The pattern '9=12–3' is not totally unexpected within this context. A similar system can be traced in the Mada language. As stated in our source (Abiel Barau Kato), "Like many languages in Platoid area, Mada has an old duodecimal numeral system up to 24."<sup>10</sup> The Mada terms for 'twelve' and 'twenty-one' are *tsɔ* and *tsɔtīyār* (*tīyār* '9') respectively. The same root for 'twelve' (*tsó* '12') is found in Ninzo, for which our source notes that "In the traditional counting system, to count beyond twelve (12), that is from thirteen onwards, entails counting in sets of twelve."<sup>11</sup> Moreover, the same root is attested in Tesu (*tsɔ* '12'). According to Uche Aaron, a primary root *ɔ̀-cʷɔ́* '12' is discernible in Eggon (beside the composite term '12=10+2'). This root is also found in Rukuba (Che) in *u-sɔ́k* '12'. The duodecimal numeral system as attested in this language is of the utmost sophistication. According to Luc Bouquiaux: "There are two words for number '72', *kitu* and *atu*, 144 can be expressed as *atu ahak* and 200 is *atu ahak ni isɔk inas ni hak ni taːrat* (72 \* 2) + (12 \* 4) + 8."<sup>12</sup> Other languages in this group normally use less exotic systems. In some of them, however, e.g. in Eten, "The highest number that can be counted in traditional way is 144,"<sup>13</sup> i.e. '12\*12'. To sum up, it seems that a primary term for 'twelve' can be reconstructed on the Proto-Platoid level, hence the pattern for 'nine' should most probably be reconstructed as \*'9=12–3'.
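The arithmetic behind the duodecimal examples quoted above can be checked with a short Python sketch. The helper `decompose` is a hypothetical illustration: it greedily splits a number into multiples of the counting units assumed for each language (12 for Birom and Eten, 72 and 12 for Rukuba). It covers additive and multiplicative patterns only, so a subtractive pattern such as '9=12–3' lies outside its scope.

```python
def decompose(n, units):
    """Greedily express n as multiples of the given counting units.

    `units` must be sorted from largest to smallest and end with 1.
    Returns a list of (unit, multiplier) pairs; zero multipliers are omitted.
    """
    parts = []
    for unit in units:
        count, n = divmod(n, unit)
        if count:
            parts.append((unit, count))
    return parts


# Birom: '21' = 12 + 9 and '80' = 12*6 + 8
print(decompose(21, [12, 1]))       # [(12, 1), (1, 9)]
print(decompose(80, [12, 1]))       # [(12, 6), (1, 8)]
# Rukuba: 200 = 72*2 + 12*4 + 8
print(decompose(200, [72, 12, 1]))  # [(72, 2), (12, 4), (1, 8)]
# Eten: 144 = 12*12
print(decompose(144, [12, 1]))      # [(12, 12)]
```

The Rukuba result reproduces the structure of the quoted expression for 200, with 72 and 12 as successive counting units.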

The system outlined above adds a new perspective to the forms with the meaning 'ten'. Presumably, there was a Proto-Platoid primary term for 'ten' that may be tentatively described as \**kop*. The alternative forms *sok/swak* may be etymologically related to the forms for 'twelve' cited above. If so, their change of meaning may have resulted from the adoption of a decimal system. The root *gur/wur* is distinguished as well.

<sup>10</sup>https://mpi-lingweb.shh.mpg.de/numeral/Ninzo.htm

<sup>11</sup>https://mpi-lingweb.shh.mpg.de/numeral/Ninzo.htm

<sup>12</sup>https://mpi-lingweb.shh.mpg.de/numeral/Rukuba.htm

<sup>13</sup>https://mpi-lingweb.shh.mpg.de/numeral/Aten.htm



The specific nature of the Platoid numeral system prevents us from providing separate forms for 'twenty' and 'hundred'. The pattern \*'20=12+8' traceable in a number of pertinent languages is reconstructed for Proto-Platoid. A compound nature is also assumed for 'hundred'.

The results pertaining to the advanced reconstructions of numerals in Proto-Platoid are summed up in the table below (Table 4.48).


Table 4.48: Proto-Platoid numeral system (\*)

### **4.1.2.9 Nupoid**

Let us try to reconstruct the Proto-Nupoid numeral system.


Table 4.49: Nupoid numerals and Proto-Nupoid (\*)

The Nupoid group is relatively small and homogeneous and poses no problem for reconstruction.


### **4.1.3 Isolated BC languages**

### **4.1.3.1 Ikaan**

The following description of the Ikaan numeral system (Table 4.50) is based on the analysis of data from a number of its dialects.

Table 4.50: Proto-Ikaan numeral system (\*)


#### **4.1.3.2 Akpes**


The original BC forms for 'five' (*\*tan*) and 'one' may have been preserved in the term for 'six'. These forms will be treated below as hypothetical.


#### **4.1.3.3 Oko**


'6' *ɔ̀-pɔ́nɔ̀ɔ́rɛ* ('5+1'); '100' *í-pì*

Table 4.52: Oko numerals

#### **4.1.3.4 Lufu**

Table 4.53: Lufu numerals


### **4.1.4 Proto-Benue-Congo**

#### **4.1.4.1 'One'**

The reconstruction of the term for '1' is objectively the most challenging (the term is especially difficult to reconstruct in languages with noun classes and complex systems of determinatives). This situation is even more complicated in the Benue-Congo languages, since more than one reconstruction of the term has been suggested. The existing hypotheses must be studied here, especially because the ones pertaining to the etymology of the term were proposed by Kay Williamson, the leading specialist in NC comparative studies. Moreover, Kay Williamson (1989b) used her reconstruction of the term for 'one' as an argument in favor of a triconsonantal structure of Niger-Congo roots. This hypothesis has been actively developed by Roger Blench (2012b etc.).


It should be noted that our evidence does not support Kay Williamson's reconstruction. Furthermore, her hypothesis regarding the triconsonantal nature of Niger-Congo roots is, in my opinion, untenable. The Bantoid data utilized by Williamson was discussed above. Now let us review the evidence she uses in support of her hypotheses. Originally she treated the root *#-kani* '1' as one of the basic BC roots ('old root', Williamson 1989b: 255). Later she changed her approach (on the basis of a wider NC context, namely on the data from the Ijo languages) suggesting a derivation of BC forms from a triconsonantal root *\*\*- 'kə* ¯ *'gə* ¯ *ni* ¯ '1', for which she assumed a different set of reflexes (Williamson 1992: 396). The changes introduced by Williamson in this article are significant. She adds the reflexes of the reconstructed root in Akpes and Nupoid, includes its additional reflexes in Esimbi and Bekwarra (Bantoid), adjusts its reflexes in Cross and Platoid (e.g. by reinterpreting PUC gá-ni/ \*-gwá-nɩ̀, previously analysed as an isolated form, as a reflex of the root in question), and, finally, omits Kainji and Jukunoid reflexes.

In further interpretation of the BC numeral systems we will use a template chart representing the fourteen branches of BC (Table 4.54). It should be noted that Bantu (as the largest sub-branch of the BC family with the most detailed reconstruction) is treated separately. This means that the Bantoid field will only include non-Bantu forms. The chart below reproduces the data published by Kay Williamson (middle sections) as well as the relevant forms obtained as a result of our step-by-step reconstruction (the rightmost section).

It should be noted that the difference between the results achieved by means of our step-by-step reconstruction (see above) and those of Williamson is significant. According to our evidence, the postulation of the root *\*\*- 'kə* ¯ *'gə* ¯ *ni* ¯ '1' for Western Benue-Congo is unsustainable. The existence of this root in Bantoid is also questionable. In her earlier publication, Kay Williamson quoted its only Bantoid reflex (*a-kina* '1') supposedly attested in Northern Bantoid Tiba (Williamson 1989b: 255). However, the affiliation of Tiba with the Bantoid languages is debatable (a connection with the Adamawa languages is suggested in Boyd 1999). In the article that followed, Williamson quoted another Bantoid form, this time the one attested in Southern Bantoid Esimbi (*keni* '1'). As noted above, this form was probably misinterpreted, because it includes the root *-ni/-nə̄*. At the same time, as I tried to demonstrate above, a number of related forms may be attested in the Mambiloid languages (Northern Bantoid): Twendi (Cambap) *tʃínī*, Mambila *tʃɛ́n*. Thus, we are possibly dealing with Proto-Eastern Bantoid *\*cin/kin*. In order to decide whether this form is an innovation or a reflex of an inherited Niger-Congo root (as Kay Williamson says), we need to place it in a wider linguistic context.



Table 4.54: BC \*kin/cin '1' and alternative reconstructions

Different colors are used in the charts to distinguish between the Eastern and the Western BC languages. A special marking is used for the Bantu languages due to their overall importance for the reconstruction. The abbreviations in the middle sections follow Williamson op. cit.: PLC – Proto-Lower Cross, PUC – Proto-Upper Cross, PP – Proto-Platoid.

This issue will be addressed later. At this point we will deal with another root for 'one' postulated by Williamson. According to her, the root is a Benue-Congo innovation.

Since the root *nə̄/ ni* is distinguishable in Esimbi, it seems logical to treat it together with another set of terms for 'one' (*#-diiŋ*). These data (termed a BC innovation by Williamson), compared to the results of our step-by-step reconstruction, are quoted in the table below (Table 4.55).


Table 4.55: BC \*ni '1' and alternative reconstructions

Let us review the distribution of this root within the Benue-Congo branches.

**Western Benue-Congo.** This root can be reliably reconstructed in Nupoid and Defoid, but not in Edoid. In Igboid it might be attested in Ikpeye: *ŋì-nɛ́* (*ŋ-ìnɛ́*?). The root is possibly found in some of the Idomoid languages as well: Etulo, Agatu *ó-yè*, Idoma *é-yè*, Alago *ó-je*, Eloyi (dial.) *ò-nzé*, *ńɡwò-nzé*.

**Eastern Benue-Congo.** Several Kainji forms deserve closer attention. The Gurmana form quoted by Williamson is unfamiliar to me. It may be related to the Bunu form, but the root itself is uncommon for Kainji and thus cannot be reconstructed. Moreover, the root is only marginally attested in the Platoid languages (single occurrences include Eskwa *è-nyí* '1' and possibly Ikulu *í-ń-jí* '1', and *kɔ̀p-ìrì-zɨŋ̄* '11'). Another rare form is *di*(*n*) with an initial oral consonant (e.g. Ayu *ɪ-dɪ* '1', Eggon *ò-rí* '1' and its palatalized variant *tʃíŋ* – cf. *ɔ̀-kbɔ́ à-tʃíŋ* '11', *ə̀-kβə́há là-tʃíŋ* '21'). These (etymologically unrelated?) forms, however, should not be reconstructed for Proto-Platoid, because the root *kin* (see above) is clearly distinguishable in the majority of the Platoid branches. At the same time, the Platoid data discredits the reconstruction of the root as \**kin*/*cin*. Multiple arguments can be adduced in favor of the interpretation of the initial velar as a reflex of an archaic noun class prefix, which would yield a Proto-Platoid form \**k-in*. This invites the possibility of an etymological connection between the Benue-Congo roots studied above, namely \*-*in* and \*-*ni*. The analysis of the Platoid compound numerals points toward the same conclusion. A number of noteworthy forms can be quoted in support of this, cf. Hyam *ʒìnì* '1' but *twaa-ni* '6' ('5+1', *twoo* '5'), Mada *tānn-ɛ̀n* '6' ('5+1', *tun* '5'), Ninzo *tānì* '6' ('5+1', *ʈʷí* '5'), Rukuba *tàiŋ* '6' ('5+1', *-túŋ* '5'). These Platoid forms bring to mind the case of the Jukunoid term for 'six'. Kay Williamson quotes a Proto-Jukunoid root *\*-yiŋ.* The reasons behind this reconstruction are not immediately apparent, since in the majority of the languages other forms are reserved for this meaning. Her reconstruction may be based on the compound terms for 'six' that follow the pattern '5+1' (or rather '5+X', with X ≠ 1), cf. e.g. Jibu *sùn-jin* '6' (*swana* '5', *zyun* '1'), *cìn-jen/ ʃì-ʒen* (*tswana* '5', *dzun* '1'). As noted above, the root in question is not reconstructable for the Platoid languages. The reconstruction of \**ni(n)* is assured only for the Eastern Benue-Congo branch (Cross), where it is systematically attested in at least three branches out of five, cf. Proto-Upper Cross (\**ni*), Central-Cross (*nin*), and Ogoni (*nɛ*). Since \**ni* can be safely reconstructed for Nupoid, Defoid and Cross, its further comparison to the pertinent roots attested in the languages that belong to other NC branches is required.

In conclusion, it should be noted that regardless of whether a conservative or a more speculative reconstruction (i.e. *\*kin* and *\*ni* vs. *\*k-in/ ni*) is preferred, the resulting root (or roots) is not tri- or disyllabic but rather monosyllabic.

In addition to this, several isolated roots for 'one' are attested in Benue-Congo. Undoubtedly, they represent local innovations. At first glance, this is applicable to the most common Bantoid roots for 'one', including the Bantu forms *mòì/mòdì mòtí*. This, however, may not be entirely correct for reasons that will be discussed in the next chapter. Another noteworthy root that may be tentatively described as *\*jir* is attested in both Oko and Platoid.

The table is subject to further interpretation. We will return to it later after the evidence from the other Niger-Congo branches has been collected. A few remarks are in order here:

1. Both Akpes terms for 'one' (*ē-kìnì, í-ɡbōn*) find close parallels in the Cross languages (*\*kin/cin, \*ni(n), \*gboŋ/gwan*). The Icheve form *à-mɔ́ɔ̀* is probably borrowed from one of the Bantu languages;


### **4.1.4.2 'Two'**

The root \**pa* (also found in the Idomoid languages) is reconstructable for Eastern Benue-Congo, but is not systematically attested in Bantu.

The Bantu form (as represented above) does not seem to be related to other Bantoid forms. However, it finds parallels in Defoid and possibly Akpes and Kainji. The most common BC form (\**ba*/*bai*) may go back to \**ba-i*, with \**ba*- being a noun class prefix. In this case, the BC form may be reconstructed as \**ba-di / ba-ji* > *bai* > *ba*, which would make the Bantu form the most archaic within Benue-Congo.

These hypotheses will be discussed below, after the evidence from the other BC branches has been reviewed.



### **4.1.4.3 'Three', 'four', 'five'**


Table 4.57: BC stems for '3', '4' and '5'

This is the most stable group of numerical terms within BC. It comprises the roots *\*tat* '3', *\*nai* '4', and *\*tan/ ton* '5' that are very well-known among the specialists in NC studies. Issues pertaining to the phonetic realization of their reflexes will be treated in the next chapter.

#### **4.1.4.4 'Six'**


Table 4.58: BC stems and patterns for '6'

As the table shows, there was probably no primary Proto-Benue-Congo root for 'six'. Two alternative patterns are traceable, namely '3PL' ('3 redupl.', '3+3') and '5+1'. Other forms are marginal. The phonetic resemblance of the Kainji and Igboid forms is noteworthy.

### **4.1.4.5 'Seven'**


Table 4.59: BC stems and patterns for '7'

A primary root for 'seven' is also indistinguishable. The form \**camba*/*samba* may have lost any phonetic resemblance to its Benue-Congo prototype \*7=5+2 in Proto-Bantoid. The Defoid and Edoid forms are phonetically comparable (a shared innovation?).

### **4.1.4.6 'Eight'**


Table 4.60: BC stems and patterns for '8'

In this case, the pattern *\*nai* '4' > *\*na(i)-nai* '8' fits the reconstruction better than its alternative. The similarity between Kainji and Defoid is peculiar and may be due to innovations.

### **4.1.4.7 'Nine'**


Table 4.61: BC stems and patterns for '9'

The rightmost column of the table includes many isolated forms (among them some primary ones). The term \**buka*, which may appear as an important BC innovation, is reconstructed for Proto-Bantoid. In addition, the pattern '9=5+4' is distinguishable in Proto-Benue-Congo. As with '8', the Defoid and Edoid forms closely resemble each other.

### **4.1.4.8 'Ten'**


Table 4.62: BC stems for '10'

This is a heterogeneous group of forms. The root \**pu/fu* attested in both Eastern and Western BC is the most likely candidate for BC reconstruction. However, it is missing from Bantoid, for which the term *\*kum/kam* is reconstructable. The latter form must be a Bantoid innovation. However, assuming that the second consonant may have undergone nasalization in Proto-Bantoid, this form is comparable to a number of other roots, suggesting that *\*kup/ kop* should be reconstructed for Eastern Benue-Congo. As the table shows, other roots should not be neglected either. They will be treated in combination with the evidence from other NC branches.

### **4.1.4.9 'Twenty'**


Table 4.63: BC stems and patterns for '20'

It is highly unlikely that the Proto-BC term followed the pattern reconstructed for Proto-Bantoid (\*'20=10\*2'). In all likelihood there was no root for 'twenty' in Proto-BC at all. It should be noted that numerous branches of Western BC use the root (*g*)*bolo* (possibly related to the lexical root with the meaning 'sack') to form 'twenty'. A shorter root (\**gba/ gwe*) is reconstructable in the same Western BC branches as well. Its source is likely lexical: it is well known that the term for 'twenty' in the NC languages often goes back to lexemes with the meaning 'man', 'leader', and 'body' (cf. Jukunoid). The resemblance between the reconstructed Idomoid and Nupoid forms is noteworthy. However, these forms might be etymologically related to the term for 'ten'.


### **4.1.4.10 'Hundred' and 'thousand'**

Table 4.64: BC stems and patterns for '100' and '1000'

If Proto-Benue-Congo did not have a term for 'twenty', it probably did not have a term for 'hundred' either, because the only pattern it could follow is \*'100=20\*5'. In this respect the Proto-Bantoid innovation (\**kam*) is noteworthy. It resembles another Proto-Bantoid innovation, namely the term for 'ten' (\**kum*/*kam*), which is hardly a coincidence. The possibility that in the cases of 'ten' and 'hundred' we are dealing with alignment by analogy cannot be excluded. This could explain the irregular nasalization of the root for 'ten', cf. Proto-Bantoid *\*kup* '10' → *kum* by analogy with *\*kam* '100'. The term for 'thousand' was certainly nonexistent in BC.


#### **4.1.4.11 Summary**

Taking this into account, the segmental reconstruction of the Proto-BC numeral system may be suggested (Table 4.65).

Table 4.65: Proto-Benue-Congo numeral system (\*)


This table gives an overview of the BC evidence that will be used for further comparison with other NC branches.

### **4.2 Kwa**

More than eighty Kwa sources were used for the reconstruction. They are representative of the major groups and sub-groups of this family, which consists of about seventy languages. A plausible internal classification of the Kwa languages does not exist. A step-by-step reconstruction of numerals may well be viewed as another important step in this direction. Our preliminary survey of the pertinent evidence is based on the traditional classification that distinguishes five major Kwa branches. We will start with the study of the numerical terms by branch. Then, individual reconstructions will be evaluated with regard to their potential for the general reconstruction of the Proto-Kwa numeral system.

### **4.2.1 Ga-Dangme**

These two languages exhibit isolated forms of the term for 'one'. Both terms will be preserved for further comparison (note that the first syllable of the Dangme term probably represents a noun class prefix). The term for 'eight' is undoubtedly constructed as '6+2'. The term for 'six' is primary, hence the term for 'seven' must be formed of '6+1'. This would suggest the existence of an additional term for 'one' (\*-*ɡō*/-*wo*). Two separate forms are attested for 'hundred'. Apart from that, the Dangme and Ga numeral systems are quite homogeneous.


The Adampe system is in many respects different, so there may be doubts as to whether it indeed belongs together with Dangme. The Adampe evidence will be treated later in this chapter.


Table 4.66: Ga-Dangme numerals

### **4.2.2 Gbe**

The reconstruction of the Proto-Gbe numeral system is straightforward, since alternative forms are few (Table 4.67). It is based on the available evidence from twelve of the Gbe dialects.

Table 4.67: Proto-Gbe numerals and patterns (\*)


The Gbe term for 'six' is primary. Its form, however, differs significantly from the (also primary) one attested in the languages of the Ga-Dangme group.

The term for 'eight' seems to be derived from 'four', whereas the term for 'nine' follows the pattern '8+1'.

The forms for 'twenty' follow the pattern 'X\*2' in Aja (*bulaa-ve*), Waci-Gbe (*blá-ve*) and Ewe (*blá-vè*), which suggests an alternative form for 'ten' (\**bula*).


The etymological relationship between the term for 'fifteen' and a lexical root with the meaning 'foot' attested in two of the dialects is an apparent innovation: Maxi-Gbe *à-fɔ̀-tɔ̃́* ('foot', '3') and Kotafon-Gbe *fɔ́-tɔ̃̀* ('foot', '3'). This pattern is attested in a number of the NC languages (including Atlantic).

A primary term for 'forty' is distinguishable (hence '50=40+10', '60=40+20', '70=40+30', '80=40\*2', '90=40\*2+10').
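The patterns built on 'forty' follow mechanically from division by 40, as the following minimal Python sketch illustrates. The function name `forty_based` and the pattern notation it prints are illustrative assumptions modelled on the notation used in this chapter; the sketch says nothing about the actual Gbe forms.

```python
def forty_based(n, base=40):
    """Render n as a pattern built on the base term (default 40)."""
    if n < base:
        raise ValueError("n must not be smaller than the base")
    count, rest = divmod(n, base)
    # One occurrence of the base is written plainly, several as 'base*count'.
    head = str(base) if count == 1 else f"{base}*{count}"
    return f"{n}={head}" + (f"+{rest}" if rest else "")


for n in (50, 60, 70, 80, 90):
    print(forty_based(n))
# 50=40+10
# 60=40+20
# 70=40+30
# 80=40*2
# 90=40*2+10
```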

### **4.2.3 Ka-Togo**

Ka-Togo is a quite diverse group of the Left Bank languages. The reconstructions for each of its three branches are provided in the table below (Table 4.68). Its rightmost column lists forms and patterns that are the most likely candidates for the Proto-Ka-Togo reconstruction.


Table 4.68: Proto-Ka-Togo numeral system (\*\*)

It needs to be stressed that the forms marked with /\*\*/ are only suggestive and should not be taken at face value. They are not reconstructions in the strict sense and only serve for comparative purposes, so the absence of a tonal marker in a reconstructed form should not be considered meaningful. It only shows that at this point the available evidence does not allow reconstructing a tone in the pertinent case.


### **4.2.4 Na-Togo**

An overview of numerical terms as attested in the branches of Na-Togo and some isolated languages is provided below (Table 4.69). A tentative reconstruction of the Na-Togo numeral system can be found in the rightmost column.


Table 4.69: Proto-Na-Togo numeral system (\*\*)

The Lelemi term for 'fifty' (*lì-tì*) is peculiar because it is a likely source of 'hundred': *è-tì á-ɲɔ́* ('50\*2').

### **4.2.5 Nyo**

The Nyo group, which comprises dozens of languages, is the most representative within the family. For this reason (even though the Nyo numeral systems are closely related to each other) they will be studied separately (by sub-group) and then compared to each other.


#### **4.2.5.1 Agneby (Abbey, Abiji, Adioukru)**

Alternative sources representative of these three languages are quoted below (Table 4.70). Significant variation of forms is sporadically attested.


Table 4.70: Proto-Agneby numeral system (\*)

The presence of the primary terms for 'seven', 'eight' and 'nine' is an important characteristic of this sub-group.

#### **4.2.5.2 Attié**

Internal reconstruction of the Attié numeral system yielded the following results (Table 4.71).



Table 4.71: Attié numeral system (\*)

#### **4.2.5.3 Avikam-Alladian**

No numerical terms (except for 'one' and 'nine') are reconstructable on the subgroup level. This raises doubts as to whether these languages should indeed be grouped together. The pertinent forms are presented in the table below (Table 4.72) and may serve as a starting point for further discussion.

Table 4.72: Avikam-Alladian numerals


#### **4.2.5.4 Potou-Tano**

#### 4.2.5.4.1 Potou

The following forms are distinguishable in the Potou sub-group (Table 4.73).


Table 4.73: Potou numerals

#### 4.2.5.4.2 Tano

The Tano branch consists of nearly thirty languages. It seems reasonable to treat them by sub-groups.

#### **Western Tano**





#### **Central Tano**

**Akanic** (Table 4.75):

Table 4.75: Akanic numerals

**Bia** The numeral systems in these languages (Agni, Baoule, Sefwi, Nzema, Ahanta, and Jwira-Pepesa) are virtually identical and can be described as follows (Table 4.76).

Table 4.76: Proto-Bia numeral system (\*)



**Guang** This sub-group has two branches, Southern and Northern Guang, which consist of four and eleven languages, respectively. Despite this, the Guang numeral systems do not differ significantly, hence quoting individual forms seems unnecessary. Our reconstructions for both branches, as well as the general Guang reconstruction, are given below (Table 4.77).


Table 4.77: Guang numerals

**Krobu; Basilia-Adele; Ega** To make our presentation complete, the evidence of these three isolated Tano languages is presented in the table below (Table 4.78).

### **4.2.6 Proto-Kwa**

Intermediate reconstructions suggested above should be compared in order to reconstruct the forms of the Proto-Kwa numerals. It seems reasonable to group potentially related forms (or patterns) together. The rightmost column contains isolated forms attested in one particular group only.

#### **4.2.6.1 'One'**

The Awikam-Alladian term for 'one' is definitely an innovation.

The root \**di* is attested in four branches out of five and thus is likely reconstructable at the Proto-Kwa level.


Table 4.78: Numerals in Tano isolated languages

Table 4.79: Kwa stems for '1'


The forms given in the left column are more problematic. Each of them contains a velar consonant (the Potou form \**ce* may have resulted from the palatalization of a velar before a front vowel, *ce < \*kue* – cf. Western Tano).

Regular phonetic correspondences between these languages have not been established and therefore cannot be used for purposes of reconstruction. In any case, the following considerations might prove useful for the NC reconstruction. The inventory of forms attested in the eighty Kwa idioms may seem rather diverse. However, only two of them may be considered for the Proto-Kwa reconstruction, namely *\*di* and *\*k(p)o* (or the compound form *\*di-kpo* suggested by the Gbe (*ɖe-kpo*) and Ega (*\*li-gɓó*?) forms).

### **4.2.6.2 'Two'**


Table 4.80: Kwa stems for '2'

The only form reconstructable at the Proto-Kwa level is evidently *\*ɲɔ*.

### **4.2.6.3 'Three' and 'Four'**


Table 4.81: Kwa stems for '3' and '4'

Just as in the majority of the NC branches, the roots for 'three' and 'four' are the most persistent. Suggested Proto-Kwa reconstructions are \**ta* and \**na* respectively.

### **4.2.6.4 'Five'**


Table 4.82: Kwa stems for '5'

The root \**tan* ('five') is only traceable in the Left Bank languages. Another root, commonly attested in other languages (\**nun*), is found in these languages as well. Both roots should be considered for the reconstruction (note that the former is comparable to the pertinent form reconstructed for Proto-Bantu).

#### **4.2.6.5 'Six'**


Table 4.83: Kwa stems for '6'

The evidence presented in Table 4.83 is inconclusive. At this stage our task is to process the complex Kwa data so that it can be compared to the evidence of other NC languages. In this respect, three provisional Kwa forms are noteworthy: *\*golo/kolo*, *\*kua,* and *\*ciɛ.* In any case, as the forms for 'seven' suggest, the Proto-Kwa term for 'six' was probably primary.

### **4.2.6.6 'Seven'**


Table 4.84: Kwa stems and patterns for '7'

The forms presented in the table above point toward the pattern '6+1' being used for the Proto-Kwa term for 'seven', whereas Proto-Nyo developed the primary term \**sun*.

#### **4.2.6.7 'Eight'**


Table 4.85: Kwa stems and patterns for '8'.

Based on the evidence attested in the table above, the Proto-Kwa term for 'eight' may be reconstructed as either primary (*\*kwe/ kye*) or derivative, in which case it must have been based on 'four' (\*'4PL').

### **4.2.6.8 'Nine'**


Table 4.86: Kwa stems and patterns for '9'

This is the hardest form to interpret. A rare pattern '8+1' is attested in the Left Bank languages. In contrast to this, the Togo pattern is '10–1', while the Nyo term (\**brɔ*/*mrɔ*) is 'primary'. The latter is probably connected to the term for 'ten', although this connection does not necessarily imply a derivation ('10–1') and could be explained by analogy. All three forms/patterns are considered for reconstruction.

#### **4.2.6.9 'Ten'**


Table 4.87: Kwa stems for '10'

Isolated forms are attested in Ga-Dangme and Attié. The root *tə(b)* is traceable in the Ghana–Togo Mountain languages (Togo-remnant) and is not found elsewhere. Thus we are dealing with another isogloss suggesting that these languages belong to the same branch. The stem *\*du* supported by R. Blench could be proposed for Proto-Kwa. This stem is indeed attested in the majority of the groups that do not belong to the Left Bank languages (including Na-Togo).

The stem \**bula* (Left Bank)/\**bulu* (Tano) is distributed fairly evenly.

Finally, a Niger-Congo root reflected in Kwa as *\*fo/wo* can be reconstructed in a number of languages.

#### **4.2.6.10 'Twenty'**


Table 4.88: Kwa stems and patterns for '20'

The pattern '10\*2' is attested in the majority of the branches. The root \**ko* is also to be taken into account.

#### **4.2.6.11 'Hundred' and 'thousand'**

In addition to the pattern '20\*5', the roots *lafa*/*lofa* and \**ya*/*ja* (Nyo) are reconstructable for 'hundred'. The latter may be etymologically related to \**ga*/*ha*.

The term for 'thousand' is commonly attested as \**a-kpi*. Its less common byform is \**pim*.



Table 4.89: Kwa stems and patterns for '100' and '1000'

Table 4.90 lists provisional Proto-Kwa reconstructions based on the evidence discussed above.


Table 4.90: Proto-Kwa numeral system (\*)

| Numeral | Reconstruction | Numeral | Reconstruction |
|---|---|---|---|
| 4 | na | 10 | fo/wo, bula, du |
| 5 | nu(n), ton | 20 | 10\*2, ko |
| 6 | golo/kolo, kua, ciɛ | 100 | 20\*5, lofa, ja/gya? |
| | | 1000 | kpi, pim |

The remaining roots and patterns are probably innovations that developed separately within a branch/language. They may help to adjust the internal classification of the Kwa languages.

### **4.3 Ijo**

According to traditional classification, the Ijo family is comprised of the Ijaw languages and the Defaka language. Some scholars express doubts as to whether the latter indeed belongs to this family. According to Roger Blench, "The Ijo languages constitute a well-founded group, but the membership of Defaka (constituting Ijoid) remains problematic. Defaka has numerous external cognates and might be an isolate or independent branch of Niger-Congo which has come under Ijo influence" (Blench 2013).

Ijaw languages consist of the Eastern and the Western groups (the latter is sometimes called Central).

The following reconstruction is based on the evidence of all three Ijo branches (Table 4.91).


Table 4.91: Proto-Ijo numeral system

Both qualifying and counting terms for 'one' are attested in the Eastern Ijo languages (e.g. in Ibani). The Defaka form may be a borrowing. An unexplained allomorph for 'one' is attested as a part of the term for 'six' in Ijaw (?).

The root for 'two' (\**mam*) is an Ijo innovation. It has no parallels outside this language family. Its phonetic similarity to several other forms is a mere coincidence, e.g. *ma*- in the Jaad (Atlantic) *maaɛ* does not belong to the root and can be explained as a class prefix. The lexical meaning 'twin, pair' (as attested in Nembe (East) according to Kaliai 1964) may underlie the Ijo term. However, no reliable parallels for this term with the meaning 'twin, pair' can be established in NC.

The root for 'three' is apparently of NC origin, with its most archaic form attested in Defaka.

The term for 'four' is undoubtedly a reflex of the NC root.

The term for 'five' probably goes back to the NC root *\*tan(o)*. As in the case of 'three', its most archaic form is found in Defaka.

The terms for 'six', 'seven', and 'nine' follow the common patterns ('5+1', '5+2', and '5+4' respectively).

The Ijaw term for 'eight' must have derived from 'four' by means of partial reduplication (*\*ni-nɛ́ín*). This pattern is reconstructable on the Proto-NC level and will be discussed at length in the next chapter.

A specific counting term for 'ten' is reconstructable in the Eastern Ijo languages (*\*àtìé*). The Defaka form is comparable to those found in the Ijaw languages.

A special form for 'fifteen' is reconstructable in Ijaw (*\*dié*), cf. e.g. the Nembe evidence: *dìé-èsí* '300' (='15\*20'). This form may go back to Ijaw *\*ɗɩɛ́ ̀* 'divide; separate into parts; split or break up into parts; share', 'distribute, donate', cf. Nembe *ɗɩɛ̀ ̀*, Ibani (Koelle 1963[1854]) *dìè-, dìé.*

As in a number of other languages that belong to different families within NC, a special form is attested for the term for 'twenty' (*\*síí*). The term itself has several functions. It serves as a basis for a number of other terms for tens (also in Defaka), e.g. '40=20\*2', … '100=20\*5'. The Ijaw terms for 16–19 are based on it as well, e.g. '16=20–4', etc.
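The arithmetic behind the vigesimal patterns cited above can be verified mechanically. The Python sketch below is purely illustrative: the helper `evaluate_pattern` and the notation it accepts (integers joined by `*`, `+`, `-` or the en dash used in this chapter) are assumptions introduced for this example and are not part of the reconstruction. The values and patterns in `IJO_PATTERNS` are those quoted in this section.

```python
import re

# Patterns quoted in the text: '40=20*2', '100=20*5', '16=20–4', '300=15*20'.
IJO_PATTERNS = {40: "20*2", 100: "20*5", 16: "20–4", 300: "15*20"}


def evaluate_pattern(pattern):
    """Evaluate a numeral pattern such as '20*5' or '20–4' (hypothetical helper).

    Only non-negative integers and the operators *, + and - are accepted;
    multiplication binds more tightly than addition and subtraction.
    """
    text = pattern.replace("–", "-").replace(" ", "")
    if not re.fullmatch(r"\d+([*+\-]\d+)*", text):
        raise ValueError(f"unsupported pattern: {pattern!r}")
    total = 0
    # Split the pattern into signed terms, then multiply the factors of each term.
    for sign, term in re.findall(r"([+\-]?)(\d+(?:\*\d+)*)", text):
        product = 1
        for factor in term.split("*"):
            product *= int(factor)
        total += -product if sign == "-" else product
    return total


for value, pattern in IJO_PATTERNS.items():
    print(value, pattern, evaluate_pattern(pattern) == value)
```

The same helper can be applied to any pattern written in this notation, provided that the pattern contains no variables such as 'X' or 'PL'.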

### **4.4 Kru**

Our analysis of the Kru numerals is based on nearly forty sources representative of five major groups and eleven major subgroups of the family. Preliminary reconstructions of the pertinent numerical terms (by sub-group) are presented in the annotated tables below.

### **4.4.1 'One', 'Two' and 'Three'**

As in the majority of the NC languages the term for 'three' is the most persistent: the root *\*taa(n)* can be reliably reconstructed for Proto-Kru.


Table 4.92: Kru stems for '1'-'3'

The same is applicable to the root for 'two' reconstructed as *\*so(n)* in Proto-Kru (isolated forms are attested in the Seme and Grebo sub-groups only). It should be noted that in general the Seme numeral system is peculiar in many respects. These peculiarities (e.g. Seme being the only language with a full set of primary terms covering the sequence from 'one' to 'ten') may be due to the isolated status of the language. In his recent article entitled "Le sèmè/siamou n'est pas kru" Vogler argues that Seme is not a Kru language (see Vogler 2015). On the basis of a comparison between Kru, Gur and Mande (Samogo) morphology and lexicon he concludes that Seme is either remotely related to the Mande languages or represents a separate branch of Niger-Congo. As we hope to demonstrate below, Seme shows systematic correspondences with neither Kru nor Mande (including the contact Mande languages – Samogo and Jowulu).

'One'. It is likely that the root *\*do* should be reconstructed on the Proto-Kru level. However, there is enough evidence for reconstructing the alternative root *\*(g)bolo.*

<sup>14</sup>Bassa, Dewoin, Gbii.

<sup>15</sup>Grebo, Krumen, Glio-Oubi.

<sup>16</sup>Wee is a Western Kru group which includes (among other languages) Sapo, Krahn, Nyabwa, Wobe.

### **4.4.2 'Four' and 'Five'**


Table 4.93: Kru stems for '4' and '5'

The forms for 'four' in the left column apparently are the reflexes of the NC root that is preserved in its archaic form \**na* in Eastern Kru, whereas in Western Kru it changes into *nyìɛ̀*.

Two major forms are observable for 'five', namely *\*gbə/ gbo* and \**mm* (Western).

### **4.4.3 'Six' to 'Nine'**

It is immediately apparent that these numerals already followed the pattern '5+X' in Proto-Kru. As noted above, the Seme forms are innovations.


Table 4.94: Kru stems and patterns for '6'-'9'

### **4.4.4 'Ten' and 'Twenty'**

The root *kʊgba* is attested beside the common NC root for 'ten' (*\*pu/fu*) in Eastern and Kuwa. The root for 'twenty' is attested as *golo* in both Eastern and Western.

### **4.4.5 'Hundred' and 'Thousand'**

All Kru sub-groups are characterized by the lack of a primary term for 'hundred'.

The form for 'thousand' in Western Kru was borrowed from the Mande languages. A primary term for '400' (\**dwi*) that developed in Eastern Kru served as the basis for a rare pattern for 'thousand' attested in these languages ('400\*2+200').

The reconstruction of the Proto-Kru numeral system is given in Table 4.95.

Table 4.95: Proto-Kru numeral system (\*)



Table 4.96: Kru stems for '10' and '20'

Table 4.97: Kru stems and patterns for '100' and '1000'


### **4.5 Kordofanian**

The evidence of about twenty Kordofanian languages does not permit reconstructing the Proto-Kordofanian numeral system (assuming that Proto-Kordofanian existed). Comprehensive data for each of the four major groups is represented below (Table 4.98). Forms and patterns traceable in at least two groups are in bold. The forms are grouped within the lines in a more or less ad hoc manner, e.g. there is no special reason to believe that Talodi *\*lu(k)/ li(k)* 'one' corresponds to the forms with initial **t-/ʈ-** attested in other groups.

The systematic presence of the final velar -**k** in some of the terms can also be found in the Atlantic languages (especially in North Atlantic).

The term for 'ten' appears in numerous forms in the Kordofanian languages, which is rare. At the same time, no root for 'ten' is represented in at least two languages simultaneously. Moreover, nearly every language in a group has its own term for 'ten'.


Table 4.98: Kordofanian numerals 1–5


Table 4.99: Kordofanian numerals >5

A primary term for 'eight' is distinguishable<sup>17</sup> in the Heiban and Rashad languages.

<sup>17</sup>I used data from the following Kordofanian languages and dialects: Aceron, Dagik, Heiban, Jomang, Katla, Koalib, Lafofa, Laro, Logol, Lumun, Moro, Nding, Orig, Rere, Shirumba, Tagoi, Talodi, Tegali, Tegem, Tima, Tira, Tocho, Utoro, Warnang.

### **4.6 Adamawa**

Adamawa is the most divergent of the NC families. The variety of numeral systems attested in the Adamawa languages confirms this statement. This can be observed not only in cases of forms that belong to different groups, but often within groups and sub-groups as well, which makes the reconstruction of its numeral system quite problematic. In other words, it is not uncommon for small Adamawa branches consisting of only a pair of languages to show incomparable forms. Some examples are in order here.

Let us compare the terms from 'one' to 'ten' in the Kim branch that is commonly attributed to the Mbum-(Day) group (Greenberg 14) (Table 4.100).


Table 4.100: Numerals in the Kim branch

Only the terms for 'four', 'six', and 'ten' are comparable in these systems.

The Longuda language constitutes a separate branch of Waja-Jen (Greenberg 10). The table below gives an overview of the first ten numerical terms as attested in two dialects of Longuda (Table 4.101). The evidence for both dialects was collected by the same scholar (Ulrich Kleinewillinghöfer<sup>18</sup>). Morphological analysis of the forms is given according to Longurama of Koola (Longuda1) and Wala Lunguda (Longuda2).

Although we are dealing with two dialects of the same language, the roots for 'one', 'two', 'three', 'six', and 'ten' attested in them are different. At the same time, the terms covering the sequence from 'six' to 'nine' follow patterns commonly attested elsewhere. Thus the differences between these dialects appear to be greater than those between the languages within the Mande or Bantu families. This raises the question as to whether a Proto-Kim or Proto-Longuda reconstruction is indeed relevant.

<sup>18</sup>https://mpi-lingweb.shh.mpg.de/numeral/Niger-Congo-Adamawa.htm

Table 4.101: Longuda numerals

Moreover, the reconstruction is further hindered by the fact that numerical terms in the majority of the Adamawa languages are subject to alignment by analogy more frequently than in other NC languages. General considerations regarding this problem can be found in Chapter 3. This is of special significance for the Adamawa languages since it affects etymological interpretations. The evidence from a number of languages belonging to the Duru sub-group of Leko-Nimbari (Greenberg 4) may serve as a case study (Table 4.102).


Table 4.102: Duru numerals

Matching final segments of the first few numerical terms in each of these languages are highlighted in red. I agree with Larry Hyman that "it might not be analogy, rather the use of a marker" (p.c.) but it should be noted that though these segments are different in each case (i.e. they do not match even within a pair of languages), they are present in each language under discussion.

In Mumuye-Yandang, which is another branch of Leko-Nimbari (Greenberg 5), an additional sub-morpheme (-t) is attested that is not present in Duru (Table 4.103).

Table 4.103: Analogical alignments in Mumuye-Yandang


The following conclusions with regard to the Proto-Duru numeral system can be reached on the basis of this evidence. First, the final segments (whatever their phonetic difference) should not be viewed as a hindrance to the comparison of numerical terms. This means that Momi *tàáz* 'three' can (and should) be compared to Longto *tããbó*. The question of whether their final segments should be analysed as morphemes or sub-morphemes is of secondary importance for our purposes. At the same time, the quality of the second consonant in Proto-Leko-Nimbari is obscure, so we have to reconstruct the form as \**taa*X, where X is an unknown consonant.

As demonstrated above, numerical terms are exceptionally divergent within the family. In addition to this, systematic (diversified) alignment by analogy is often employed in the languages under study. Both factors make the reconstruction a challenging task, even though an attempt at reconstruction of the Adamawa numerals by a highly competent scholar is available (see Boyd 1989). His results, however, are of limited relevance for our comparative purposes, as the following example shows. According to Boyd, the Proto-Adamawa term for 'one' is to be reconstructed as *\*ku-di-n* (the root *\*di*) with \**kwin* being its later development. His ideas on how this proto-form is reflected in particular branches of the Adamawa family are summarized in the table below (Table 4.104). Notations in the first column refer to Greenberg's grouping of the Adamawa languages.


Table 4.104: \**kwin-* reflexes in Adamawa according to Boyd

Even if Boyd's reconstruction of the Proto-Adamawa form is correct, a diachronic interpretation that implies an etymological relationship between *bimbimi*, *cɔŋ*, *ɗu* and *gbet* does not fit the purpose of our integral comparative study of NC numerical terms because it can be used to justify nearly any etymological connection. In view of this, the Adamawa numerical terms will be treated in the same way as those from the preceding language families. First, the main forms of the numerical terms will be established, with no attempt at tracing them down to a provisional proto-form. Then the numeral systems of each of the Adamawa branches will be studied separately. Finally, an integral analysis of the available evidence pertaining to each of the terms will be offered. This approach will enable us to treat the Fali languages and even Laal together with the Adamawa languages, although their relationship to the latter is often questioned (in the case of Laal, doubts are raised as to whether it belongs to NC at all).

### **4.6.1 Fali-Yingilum (G11)**

It should be noted that after a nasal, -*r*- in the Fali forms regularly corresponds to -*N*- in those of Yingilum, cf. '5' Fali *kɛ̄rɛw* ~ Yingilum *kɛ́ɲàu*, '7' *jɔ̄rɔ̄s* ~ Yingilum *jə́nə̀s*. An alignment by analogy is probably attested in the terms for 'three' and 'four' (*\*taaX* > *taan* may have changed by analogy with *\*naan*).


### **4.6.2 Kam (Nyimwom, G8)**

Table 4.106: Kam numerals


Within the NC context, a reversive alignment by analogy may be considered: *\*na*X '4' > *nar* by analogy with *\*car* '3'. As Boyd rightfully observes, in the case of 'one' it is often unclear whether the initial consonant is a part of the root, or a reflex of the noun class prefix.

The term for 'seven' simulates the pattern '7=6+2' (this phenomenon is not infrequent in NC). Sometimes (e.g. in some of the Mande languages) this impression is due to the fact that the term for 'six' originally derived from '5+'. Over time, an innovation replaced the original term for 'five', which was only preserved in the derived term for 'six'. Alternatively, the term for 'seven' could be explained as 'the other six' (or 'a big six' in some languages), as perhaps in Kam, assuming that *jù:p* does not go back to the term for 'five'.

### **4.6.3 Leko-Duru-Mumuye (G4, G2, G5)**

This group is often labeled Leko-Nimbari. Here we follow Raimund Kastenholz and Ulrich Kleinewillinghöfer, who note that "The term 'Nimbari' should not to be used as a classificatory term, nor should the scarce and surely in large parts erroneous data be given central significance in any comparative approach to Adamawa languages" (Kastenholz & Kleinewillinghöfer 2012).

#### **4.6.3.1 Duru (G4)**



This table provides an overview of forms and patterns attested in eleven sources for this sub-group. This degree of variety is not normally attested within a single sub-group, which raises doubts as to whether these languages should be grouped together.

#### **4.6.3.2 Leko (G2)**

Our study of this sub-group is based on the evidence of two languages. The summary table above is not descriptive of the language-specific mechanisms of the alignment by analogy. An overview of the numerical terms covering the sequence from 'two' to 'five' by language is provided in Table 4.109.

Table 4.108: Leko numerals


Table 4.109: Analogical alignments in two Leko languages


Apparently, the terms from 'three' to 'five' in these two languages are related to each other. At the same time, two groups of terms ('2–3' and '3–4') with an alignment by the ultima are observable in Kolbila. This is applicable to a group of Samba Leko terms as well, namely '2–4' (possibly also '5'; the fact that the Samba Leko terms are adjusted by both the vowel quality and the tone is noteworthy). This means that the seemingly unrelated roots for 'two' may have derived from a common etymon (still unknown to us) by means of alignment by analogy. The source form of 'two' remains obscure. Assuming that it was similar to the one reconstructed for the Duru sub-group (e.g. \**ru*), it is likely that the same form is to be reconstructed for Leko as well: *\*ru* > Kolbila *nu* by analogy with *toonu* '3'; *\*ru* > Samba Leko *rà* by analogy with *toorà* '3'. However, the evidence in favor of this reconstruction is inconclusive. Alternatively, the initial vowel of the term for 'two' (**\*ii-/in-**) may reflect the source root, while the final segment is potentially explained via an alignment by analogy with '3'.

#### **4.6.3.3 Mumuye-Yandang (G5)**


This sub-group is represented by three languages that show different forms of 'two'. The terms for 'three' and 'four' are adjusted by analogy. Studying them in a wider NC context reveals that the final consonant in 'four' was adjusted by analogy with 'three'. The alignment itself must have occurred already at the Proto-Mumuye-Yandang level, which explains our provisional reconstructions suggested for this proto-language in the table above.

No evidence pertaining to the Nimbari numerals is available to us. The forms of 'one' given by Boyd (Boyd 1989) are noteworthy (Nimbari *(n)yeme/ geme/ (ʒeme?)*).

### **4.6.4 Mbum-Day (G13, G14, G6, Day)**

#### **4.6.4.1 Bua (G13)**

This is a very divergent branch that has been poorly documented. I would like to thank Pascal Boyeldieu, who has provided me with his personal data on Ɓa (Bua) and Lua (Niellim), as well as some other rare sources. The main forms and patterns are shown in Table 4.111.

Numerals in the Bua group can be presented as follows (Table 4.112).


Table 4.111: Bua numerals

Table 4.112: Bua numerals (summarized)


#### **4.6.4.2 Kim (G14)**

The first ten terms of Besme and Kim are given in the table above (Table 4.100). The term for 'twenty' in these languages follows the pattern '10\*2', whereas the Kim term for 'hundred' is borrowed from Arabic. The Besme term for 'hundred' is borrowed from the French *sac* 'sack', whereas the term for 'thousand' is borrowed from Bagirmi.

#### **4.6.4.3 Mbum (G6)**

Table 4.113: Mbum numerals


This sub-group is represented by a dozen languages. Unlike in Leko-Duru-Mumuye, no alignment by analogy is attested. Some forms of 'two' are of unclear morphological structure.

### **4.6.4.4 Day**

Table 4.114: Day numerals


This branch consists of a single isolated language. Its attribution to Mbum-Day has been a subject of scholarly debate. The form \**mon* '1' is postulated on the basis of *sɛrì mòn ̄* 'six', whereas the reconstruction of *\*bīyām* (*\*bī-yām*?) '4' is based on *bīyām tà* 'seven'.

### **4.6.5 Waja-Jen (G9, G10, G1, G7)**

#### **4.6.5.1 Jen (G9)**

Table 4.115: Jen numerals


This branch is represented by two languages: Burak and Jenjo (Dza). The evidence from this group is among Boyd's best arguments for the reconstruction of *\*kwin* (<*\*ku-di-n*) 'one'. The primary term *li* (*bwa-li*) 'fifteen' is attested in Jenjo. Accordingly, the term for 'sixteen' follows the pattern '15+1' (*bwali ji tsɨnɡ*). Interestingly, in Burak the term for 'hundred' is *li* (*li kwín*).

The form *\*hwĩ* 'five' is traceable in Jenjo compound terms covering the sequence from 'six' to 'nine' (*hwĩ-tsɨnɡ* 'six', *hwĩ-yunɡ* 'seven', etc.) as is the corresponding Burak form *\*na* 'five' (*naa-ʃín* 'six', *náá-re* 'seven', *ná-tát* 'eight'). The form \**re* 'two' is observable in *náá-re* 'seven', whereas *\*ʃín* 'one' is traceable in *naa-ʃín* 'six'.
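The segmentation of the Jen compounds cited above can be illustrated with a short Python sketch. It is only a rough approximation under stated assumptions: the helper names (`strip_marks`, `split_compound`, `shared_base`, `addends`) are hypothetical, and stripping tone and nasalization marks and collapsing doubled letters is a crude normalization adopted here solely to make the variants *naa-*, *náá-* and *ná-* comparable. The forms are those quoted in this section.

```python
import re
import unicodedata

# Compound terms quoted in the text, with their numerical values.
COMPOUNDS = {
    "Jenjo": {"hwĩ-tsɨnɡ": 6, "hwĩ-yunɡ": 7},
    "Burak": {"naa-ʃín": 6, "náá-re": 7, "ná-tát": 8},
}


def strip_marks(form):
    """Drop combining diacritics and collapse doubled letters (assumed normalization)."""
    decomposed = unicodedata.normalize("NFD", form)
    bare = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return re.sub(r"(.)\1+", r"\1", bare)


def split_compound(form):
    """Split a hyphenated compound into its normalized first and second elements."""
    first, _, second = form.partition("-")
    return strip_marks(first), strip_marks(second)


def shared_base(forms):
    """Return the first element if all compounds share it, otherwise None."""
    bases = {split_compound(form)[0] for form in forms}
    return bases.pop() if len(bases) == 1 else None


def addends(forms):
    """Under a '5+X' reading, map X (= value - 5) to the second element."""
    return {value - 5: split_compound(form)[1] for form, value in forms.items()}


for language, forms in COMPOUNDS.items():
    print(language, shared_base(forms), addends(forms))
```

Note that the output forms are normalized, so they lack the tone marks of the attested forms; the sketch only shows that a single first element recurs across the compounds of each language.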

#### **4.6.5.2 Longuda (G10)**

The evidence for the first ten numerals in two Longuda dialects can be found in the table above (Table 4.101). The term for 'twenty' in these languages follows the pattern '10\*2'. The forms of 'hundred' are *pùlò(wé)/phulewe*.


Table 4.116: Waja numerals

#### **4.6.5.3 Waja (G1)**

Some languages in this sub-group are characterized by a sub-morphological alignment of the terms for 'three' and 'four' well-attested in Adamawa: Dadiya *tal* '3' ~ *nal* '4', Bangunji (dial.) 1 *táát* '3' ~ *náát* '4', Bangunji (dial.) 2 *taar* '3' ~ *naar* '4', Tula (Kɨtule) *jí-tːà* '3' ~ *jáː-nà* '4'. As a result, these terms are treated as minimal contrastive pairs in the paradigm. Within the NC context, forms with the final -*t* should be considered prototypical in the case of both terms. This means that *\*naa*X 'four' (final consonant unknown) may have evolved into \**naat* by analogy with 'three' in Proto-Waja. Later, an innovative form for 'three' developed in Awak and Waja: Awak *kunúŋ*, Waja *kunoŋ*. The Dijim-Bwilim *bwanbí* is apparently an innovation.

Interestingly, the forms for 'six' attested throughout the sub-group resemble the Awak and Waja forms for 'three'. However, the forms for 'six' can be explained as '5+1' (assuming that they include an allomorph of \**kun* 'one').
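The alignment of 'three' and 'four' described in this section amounts to the two terms sharing everything but their initial consonant. A minimal Python sketch of how such shared endings might be detected is given below. The function names `segments` and `shared_ending` are hypothetical, and treating each base character together with its combining marks as one segment is a simplifying assumption; the forms are the Dadiya and Bangunji pairs quoted above.

```python
import unicodedata

# 'three' ~ 'four' pairs quoted in the text.
PAIRS = {
    "Dadiya": ("tal", "nal"),
    "Bangunji (dial.) 1": ("táát", "náát"),
    "Bangunji (dial.) 2": ("taar", "naar"),
}


def segments(form):
    """Split a form into base characters, each with its combining marks attached."""
    result = []
    for ch in unicodedata.normalize("NFD", form):
        if unicodedata.combining(ch) and result:
            result[-1] += ch
        else:
            result.append(ch)
    return result


def shared_ending(first, second):
    """Return the longest sequence of final segments common to both forms."""
    a, b = segments(first), segments(second)
    common = []
    while a and b and a[-1] == b[-1]:
        common.append(a.pop())
        b.pop()
    return unicodedata.normalize("NFC", "".join(reversed(common)))


for language, (three, four) in PAIRS.items():
    print(language, three, four, shared_ending(three, four))
```

A shared ending that covers the whole form except the first segment is compatible with alignment by analogy, but it does not by itself show which of the two terms served as the model.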

#### **4.6.5.4 Yungur (G7)**

The terms for 'twenty', 'hundred' and 'thousand' are attested in only one source (Kaan (Libo)) out of the eight sources available for this branch, hence they are quoted in brackets. Morphological analysis of the terms for 'one' and 'two' is unclear: *\*fV* may be a reflex of the original noun class prefix.

Table 4.117: Yungur numerals

### **4.6.6 Laal**

Finally, let us turn to the Laal numeral system. Laal's attribution to the Adamawa languages (as well as its attribution to NC) is debatable. Today it is assumed that it is an isolated case within Niger-Congo. Comparative study of its numerical terms may shed light on its genealogical relationship (Table 4.118).



As in many other NC languages, the major problem with Laal numerals is the obscurity of their morphological structure. Pascal Boyeldieu established that traces of noun class suffixes are observable in Laal forms, as their comparison to sg and pl forms shows (see Boyeldieu 1982). However, as I tried to demonstrate elsewhere (Pozdniakov 2010), some traces of noun class prefixes have been preserved in this language as well. At this point, it seems reasonable to set the alternative variants aside for further comparison.

What follows is an attempt to synthesize the Adamawa evidence.

### **4.6.7 Proto-Adamawa**

#### **4.6.7.1 'One'**

The main forms are given in Table 4.119.


Table 4.119: Adamawa stems for '1'

In accordance with Boyd's hypotheses discussed above, the forms in the first two columns may be related in view of the reconstruction of the root \**di* (possibly also \*-*in*), the noun class prefix \**ku*- and the suffix \*-*n* (\**ku-di-n* '1').

The last column lists forms that are attested in one of the branches only. The roots that can be tentatively reconstructed as *\*do*, \**nga/ngɔ*, *\*(g)bunu* and *\*mon* are noteworthy.

#### **4.6.7.2 'Two'**

The main forms of this root are quoted in Table 4.121. The grouping of forms is admittedly not substantiated enough. The variety of forms within this family is striking, even when unrestricted phonetic grouping is applied.

#### **4.6.7.3 'Three'**

Comparative evidence for this root points to its reconstruction as \**taat* (with further alignment by analogy within each of the branches). As in the other NC families, the root is exceptionally stable, in contrast to the roots for 'one' and 'two' that demonstrate a wide variety of forms. A shared innovation in Jen and Waja (attested in Burak, Awak and Waja) is noteworthy.


Table 4.120: Adamawa stems for '3'


Table 4.121: Adamawa stems for '2'

### **4.6.7.4 'Four'**


Table 4.122: Adamawa stems for '4'

The main NC form *\*na*X is predominant here, its second consonant being subject to alignment by analogy. The same root is likely to be reconstructed at the Proto-Adamawa level as well.

### **4.6.7.5 'Five'**

The main root (*nun*) may be the same as in the Gur languages and may be etymologically related to the term for 'hand'. It is likely that the isolated forms quoted in the rightmost column go back to similar terms as well. The Jen root *hmə* could be a borrowing from Chadian Arabic: *xamsa* '5'. The Mbum forms *ndēɓē/ dūwēe* may be influenced by Fula (*jowi* 'five').


Table 4.123: Adamawa stems for '5'

### **4.6.7.6 'Six'**


Table 4.124: Adamawa stems and patterns for '6'

The most frequently attested pattern is '5+1'. However, there is a great variety of isolated forms (see the last column). The similarity between the Laal and Longuda forms is noteworthy; both may go back to Chadian Arabic *sitːe* 'six'. The Kim (and also Yungur?) form could be a borrowing from Bagirmi (*mìká* '6').

#### **4.6.7.7 'Seven'**


Table 4.125: Adamawa stems and patterns for '7'

As in the case of 'six', the predominant pattern ('5+2') for 'seven' is rather plain. It co-exists with a variety of isolated forms of uncertain etymology.

### **4.6.7.8 'Eight'**


Table 4.126: Adamawa stems and patterns for '8'

The pattern '8=4 redupl.' is to be reconstructed at the Proto-Adamawa level.

### **4.6.7.9 'Nine'**


Table 4.127: Adamawa stems and patterns for '9'

A primary term for 'nine' was apparently non-existent in Proto-Adamawa. A comparison between Bua *diar* and Kanuri *ləɣár* may be suggestive if a borrowing is considered. The same applies to the terms for 'nine' in Waja (*tɔɔrɔ*) and Hausa (*tara*).

#### **4.6.7.10 'Ten'**

Two alternative roots for 'ten' (Table 4.128) are distinguishable (\**boo* and \**kob* attested in four and two groups respectively). The root *d*(*u*)*o* is observable in two Mbum-Day sub-groups. Finally, the root *kutu*(*n*) is found in two languages, namely in Tunya (Bua) and Kaan (Yungur). Assuming that *ku*- is a class prefix, this root may prove to be related to *tūū* (Laal).


Table 4.128: Adamawa stems for '10'


#### **4.6.7.11 'Twenty'**

The term for 'twenty' (Table 4.129) in the Duru languages either follows the pattern '20=10\*2' or goes back to the lexical roots for 'head' and 'staff'. The Niellim term *do-ksap* was likely borrowed from Bagirmi *dùɡ sap* 'twenty'.


Table 4.129: Adamawa stems and patterns for '20'

### **4.6.7.12 'Hundred'**


Table 4.130: Adamawa stems and patterns for '100'

The fact that this term was massively borrowed (most likely simultaneously) from Fula and Arabic suggests that it was lacking in Proto-Adamawa. It can be assumed that the root *ru* attested in Bua and Yungur is also a borrowing, this time from Bagirmi *àrú* 'hundred'.

### **4.6.7.13 'Thousand'**


Table 4.131: Adamawa stems and patterns for '1000'

The term for 'thousand' was massively borrowed from Fula, Bagirmi and Hausa, which points to its absence in the proto-language.


### **4.7 Ubangi**

What follows is a preliminary analysis of the evidence from five separate language groups: Ubangi-Banda, Gbaya-Manza-Ngbaka, Ngbandi, Sere-Ngbaka-Mba (A. Ngbaka-Mba, B. Sere), and Zande.

### **4.7.1 Banda**

The form *gba* 'ten' is traceable in the Mbanza (Mabandja) terms for tens.



### **4.7.2 Gbaya-Manza-Ngbaka**

Table 4.133: Numerals in Gbaya-Manza-Ngbaka


Ives Moñino's reconstructions (Moñino 1995) are quoted in the table under an asterisk. Selected noteworthy forms are also included.

From a diachronic perspective, the forms *\*ɭíítò* and *\*bùà* 'two' probably included noun class prefixes. They go back to *\*-too* and *\*-wa* respectively (cf. *vàχ* '2' in Gbaya Mbodomo).


In his discussion of *\*mɔ̀ɔ̀rɔ́*, Moñino states that "La variante *\*mɔ̀ɔ̀rɔ́* semble être une contraction de *\*mɔ̀r-kɔ̃́*, dans laquelle on peut reconnaître l'élément *kɔ̃́* 'main' … " [The variant *\*mɔ̀ɔ̀rɔ́* appears to be a contraction of *\*mɔ̀r-kɔ̃́*, in which the element *kɔ̃́* 'hand' can be recognized …] (Moñino 1995: 655). He also makes the following observation regarding the reconstruction of the term for 'ten': "*\*ɓú* 'dix' est en relation avec *\*ɓú* 'façonner, faire un cercle, joindre les mains'; la série partielle *ɓú-kɔ̃́* est encore plus explicite, et décrit le geste qui accompagne l'énonciation du chiffre 10 chez tous les locuteurs" [*\*ɓú* 'ten' is related to *\*ɓú* 'to shape, to make a circle, to join the hands'; the partial series *ɓú-kɔ̃́* is even more explicit, and describes the gesture that accompanies the utterance of the number 10 among all speakers] (Moñino 1995: 656).<sup>19</sup> This is an important point, especially in view of the relatively frequent occurrence of *bu* in the NC languages and the possible etymological relationship between *\*ɓú* and phonetically similar forms attested in other branches. However, such a relationship would be doubtful under Moñino's etymological hypothesis.

The following etymology is suggested for 'hundred' by Thomas Elvis Guenekean: "The word *gɔ̃̀m* means 'cut' or 'gathered' and *n͡màː* means 'things'."<sup>20</sup> According to Moñino, the form literally means 'frapper-l'une l'autre (les mains)', i.e. 'to strike one against the other (the hands)' (Moñino 1995: 657).

### **4.7.3 Ngbandi**

The Ngbandi and Yakoma evidence points toward the reconstruction outlined in the table below (Table 4.134).


Table 4.134: Numerals in Ngbandi

<sup>19</sup>However, in some Gbaya languages, these forms differ by tone: Gbaya (Roulon-Doko) ɓú '10' ~ ɓu 'to tap; to applaud, to roll'.

<sup>20</sup>https://mpi-lingweb.shh.mpg.de/numeral/Gbaya-Bossangoa.htm


### **4.7.4 Sere-Ngbaka-Mba**

Since the languages within this group are extremely divergent, it seems reasonable to treat the evidence from its two major sub-groups separately.

Ngbaka-Mba (Table 4.135)


Table 4.135: Numerals in Ngbaka-Mba

Sere (Table 4.136)

Table 4.136: Numerals in Sere



Sere-Ngbaka-Mba (Table 4.137)

Table 4.137: Sere-Ngbaka-Mba numeral system (\*)


### **4.7.5 Proto-Ubangi**

The evidence pertaining to each of the numerical terms is summarized below.

#### **4.7.5.1 'One'**


Table 4.138: Ubangi stems for '1'

Two competing roots (*\*le*/*ne* and *\*k(p)o(k)*) are distinguishable here.

### **4.7.5.2 'Two'**


Table 4.139: Ubangi stems for '2'

The only root widely attested within this family is *\*si/ʃi*.

### **4.7.5.3 'Three' and 'four'**


Table 4.140: Ubangi stems for '3' and '4'

The roots for 'three' and 'four' can be securely reconstructed as *\*taar* and *\*naar* respectively (with an alignment by analogy applied).


### **4.7.5.4 'Five'**

The Proto-Ubangi form is unclear, since the term for 'five' is based on the lexical root meaning 'hand' (*\*kɔ*) in two groups out of five. The only root whose attestations are not limited to a single group is *\*du(w)/lu(w)*.

### **4.7.5.5 'Six'**


Table 4.142: Ubangi stems and patterns for '6'

In addition to forms that follow the common pattern '6=5+1', a number of other forms of uncertain etymology are attested in the first two groups (and possibly in Sere-Ngbaka-Mba as well, assuming that our morphological analysis of pertinent forms is correct).

### **4.7.5.6 'Seven'**


Table 4.143: Ubangi stems and patterns for '7'

The variety of forms attested in Ngbaka-Mba is noteworthy.

### **4.7.5.7 'Eight'**

Table 4.144: Ubangi stems and patterns for '8'


#### **4.7.5.8 'Nine'**

Apparently, at the family level the common pattern '5+' should be assumed for the terms from 'six' to 'nine'. Isolated forms attested in groups and sub-groups are quoted here (as well as in the cases of other families) in order to collect exhaustive evidence for further etymological analysis. Moreover, a small chance that the Niger-Congo proto-form is traceable within only a single branch should not be ignored.


Table 4.145: Ubangi stems and patterns for '9'

#### **4.7.5.9 'Ten'**



The reconstruction of the term for 'ten' is so problematic that it raises doubts as to whether it was present in Proto-Ubangi at all. In view of the convincing internal etymology suggested by Ives Moñino, the root \**bu* alternating with \**pu* and \**fu* in some of the NC families is an unlikely candidate. The reconstruction of *\*gba/ kpa* is worth considering. However, the root may not be primary.

### **4.7.5.10 'Twenty'**


Table 4.147: Ubangi stems and patterns for '20'

Two reconstruction possibilities are available here, i.e. the pattern '20=10\*2' commonly attested in NC, and a derivation from the lexical term meaning 'person'.

### **4.7.5.11 'Hundred'**

Table 4.148: Ubangi stems and patterns for '100'


Most of the forms are apparent borrowings, which suggests that the term for 'hundred' was absent in Proto-Ubangi.


#### **4.7.5.12 'Thousand'**

The absence of the term for 'thousand' in Proto-Ubangi is even more evident than the absence of the term for 'hundred'.



### **4.8 Dogon and Bangime**

A step-by-step reconstruction of Dogon numerals does not seem reasonable because the family is relatively homogeneous. In addition, the formal differences between the numerical terms do not seem to correlate with the internal genealogical classification of the Dogon languages. The table below offers an overview of the pertinent data (Table 4.150) and is followed by a brief commentary.

Table 4.150: Dogon numerals



**'Two':** The forms with the nasal **n**- attested in several dialects are variants of the basic form with **\*l**-. It should be noted that the final palatal element is systematically attested in other numerical terms, e.g. in Ben Tey (Table 4.151).


Table 4.151: Final palatal in '2'

Regardless of whether this element is a morpheme or not, we are certainly dealing with a phonetic alignment by the final segment. Thus the final -**y** should not be reconstructed even in those forms that show its presence in the majority of languages.

**'Three':** This is a persistent form with only minor modifications applied to it (e.g. *taandu*, *taali*).

**'Four':** This is the only term for which the final palatal (probably nasalized) is potentially reconstructable. If so, the systematic alignments by analogy attested in the final segments of other numerals are probably based on the form of 'four'. The root *kɛɛso/ kɛ́ːjɔ́/ kɛ́:jɛ̀y/ cɛ́zɔ̀/ yè-cɛ́zɔ́* is probably an innovation (see, however, Jeff Heath, who argues for its archaic nature<sup>21</sup>). The term may be etymologically connected to the term for 'eighty', cf. Najamba-Kindige *sîm*, *kɛ̀ːsǔm*, Tommo So *kɛ̀ɛ̀súm* and a number of other related forms (Yorno So *dɔ̀gɔ̀-sǔm* '80', "Dogon hundred", Valentin Vydrin, p.c., Perge Tegu *dɔ̀gɔ̀-sǔŋ* '80', Yanda Dom *sìŋ* '80' etc.).

**'Five':** The etymological connection of this term with the lexical root meaning 'hand' *nùmà/ nùmó/ nùmɔ́/ nǒỹ* is immediately apparent.

**'Six' and 'seven':** These are probably primary terms.

**'Eight':** The root *sagi* attested in Najamba and Yanda Dom was probably borrowed from Mande. The forms *sila, seele* observable in a number of dialects may be related to it. The root *gá(a)rà* is commonly attested in the majority of languages of this group, sometimes with a partial reduplication (Donno So/Yorno So/Toro So *ga-gara/ga-gira*). Partial reduplication is a popular means of deriving 'eight' from 'four' commonly attested throughout NC. In view of the fact that the Dogon counting system is based on 8, this root should probably be compared to *gàrá,* meaning 'big, large, a large quantity, a lot, go beyond (limit), more, to a greater extent'. Tonal differences may be neglected in this case, especially since the derived forms tend to be formally marked, e.g. tonally.

<sup>21</sup>http://dogonlanguages.org

**'Hundred':** The basic 'large number' in Dogon is 'eighty' rather than 'hundred', so this meaning should probably be reconstructed for *siiŋ/suŋ*. In view of this, the fact that the term for 'hundred' was borrowed from Fula in nearly all Dogon languages is not a coincidence.

**'Thousand':** Similarly, the root *muɲu* (var. *mùsú / mùdʒú*) '800' incorporated into the pattern '1000=800+200' is reconstructed in Dogon.

The Bangime numeral system should also be considered here, since most of the numerical terms attested in this isolated language are comparable to those found in Dogon (Table 4.152).


Table 4.152: Bangime numerals

As in Dogon, the terms covering the sequence from 'six' to 'nine' are primary. An isolated root for 'forty' (also represented in some of the Dogon languages) is attested in Bangime. Interestingly, the root is the same as the one found in some of the Mande languages, cf. Bangime *dɛ̀ʋɛ́*, Dogulu Dom (Dogon) *dɛ̀ɛ́*, Mombo (Dogon) *dɛ̂ː*, Marka Dafing *dɛbɛ*, Bozo *dɛ̀bɛ́/ lɛ́wɛ̀*, Bamana *dɛ̀bɛ́*.

The root for 'ten' does not correspond to the one attested in Dogon. The latter finds a direct parallel in Boko (East Mande) *kuri* 'ten'.


### **4.9 Gur**

It should be noted that the Gur languages are extremely divergent in the majority of their numerical terms (including those that prove to be fairly persistent in other families). The approach we took for the evidence studied above (i.e. the establishing of the most common forms and their further comparison to the data from other branches) may not appear fruitful in the case of the Gur languages.

To deal with the problem, we are going to use the classification of the Gur languages found in *Ethnologue*, namely A. Bariba, B. Central, C. Kulango, D. Lobi, E. Senufo, F. Teen, G. Tiefo, H. Tusia, I. Viemo, J. Wara-Natioro.<sup>22</sup> The Gur family comprises nearly a hundred languages. In terms of the classification outlined above, their distribution is uneven. Seven groups (Bariba, Kulango, Lobi, Teen, Tiefo, Tusia, Viemo) have an isolated language as their only member. Similarly, Wara-Natioro is represented by only three idioms. This means that the majority of the Gur languages are split between the two remaining groups, i.e. Senufo and Central. The former comprises about fifteen languages and is relatively homogeneous. Its affiliation to Gur is often considered doubtful. Compared to Central, which embraces the majority of the Gur languages (nearly seventy), this group is relatively small. Two major sub-groups are identifiable within Central, i.e. Northern (38 languages) with Oti-Volta (33 languages) as the dominant branch and Southern (31 languages) with its dominant branch of Grusi (23 languages). In other words, 71 of the Gur languages (out of a total of 91) belong to either Oti-Volta, Grusi or Senufo. In addition to that, there are more than ten branches represented by a single isolated language each. No evidence points to their possible affiliation with the major branches or to their inter-relationship. The same can probably be said about several isolated languages affiliated (often uncritically) with the Central group (the Bwamu, Kurumfe, Dogoso-Khe, Gan-Dogosé, and Kirma-Tyurama branches). This already complex picture becomes even more complicated in view of the following:


<sup>22</sup>This classification is accepted here with slight modifications based on recent studies. For instance, Dyan and Lobi are treated as members of the same branch.


two families is not clear at all. This means that some of the Gur branches may prove to be more closely related to Adamawa.

Our reconstruction of the Gur numeral system is based on nearly 120 sources that vary with regard to the evidence they offer (cf. our considerations above). By addressing one of the most problematic cases (i.e. the reconstruction of the Gur term for 'one') we hope to work out a general approach that will eventually allow further comparison of the Gur evidence to that of other NC families.

### **4.9.1 'One'**

The table below lists several forms of the term for 'one' in smaller Gur branches (Table 4.153).


Table 4.153: Diversity of stems for '1' in Gur

A brief study of these examples raises doubts as to whether the Gur numeral system is reconstructable at all (not to mention the Grusi-Northern system or those of the more isolated Gur branches).

Even if we consider only monosyllabic roots of the CV(C) type, the impression remains that every conceivable root for 'one' is attested in the Gur languages. However, none of these roots is traceable in at least half of the Gur groups. This situation is reflected in the matrix below (Table 4.154).

The first figure refers to the number of groups where a form is attested (with a maximum of 10 groups), whereas the second one refers to the number of languages. Thus, **B-I** denotes a form comprising a voiced labial consonant (b, w or m) and a front vowel that is attested in five languages within three groups (Central, Lobi-Dyan and Senufo) (Table 4.155).
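The notation of the matrix can be made concrete with a short sketch. The Python fragment below is only an illustration under stated assumptions: the text defines **B** as a voiced labial (b, w or m) and the two figures as counts of groups and languages, but the remaining consonant and vowel classes (`CONSONANT_CLASS`, `VOWEL_CLASS`), the helper names, and the toy word list are hypothetical and are not taken from Tables 4.154 and 4.155.

```python
from collections import defaultdict

# Assumed mapping from initial consonant to class label. Only B = {b, w, m}
# is stated in the text; the other entries are illustrative guesses.
CONSONANT_CLASS = {
    "b": "B", "w": "B", "m": "B",
    "t": "T", "d": "D",
    "k": "K", "g": "G",
}
# Assumed vowel classes: I = front, A = low, U = back.
VOWEL_CLASS = {"i": "I", "e": "I", "ɛ": "I",
               "a": "A",
               "u": "U", "o": "U", "ɔ": "U"}


def cv_code(root):
    """Return a label such as 'B-I' for a CV(C) root, or None if the first
    two segments are not covered by the assumed classes. Diacritics and
    digraphs are not handled in this sketch."""
    if len(root) < 2:
        return None
    c = CONSONANT_CLASS.get(root[0])
    v = VOWEL_CLASS.get(root[1])
    return f"{c}-{v}" if c and v else None


def tally(records):
    """records: iterable of (group, language, root) triples.
    Returns {code: (number of groups, number of languages)}."""
    groups = defaultdict(set)
    languages = defaultdict(set)
    for group, language, root in records:
        code = cv_code(root)
        if code is None:
            continue
        groups[code].add(group)
        languages[code].add(language)
    return {code: (len(groups[code]), len(languages[code])) for code in groups}


# Hypothetical toy data; language names and roots are placeholders.
toy = [
    ("Central", "Lang1", "be"),
    ("Central", "Lang2", "mi"),
    ("Senufo", "Lang3", "wi"),
    ("Lobi-Dyan", "Lang4", "bi"),
    ("Central", "Lang5", "ka"),
]

for code, (n_groups, n_languages) in sorted(tally(toy).items()):
    print(f"{code} ({n_groups}/{n_languages})")
```

With real data, each cell of the matrix would correspond to one key of the dictionary returned by `tally`, written in the same "groups/languages" format as in the text.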

The remaining forms are quoted below as an illustration of their extreme divergence.



Table 4.154: Distribution of the CV(C)- forms for '1' in the Gur languages

Table 4.155: BI- forms for '1' in Gur (3 groups, 5 languages)


(1) a. **BA** (1/4) (Table 4.156).

Table 4.156: BA- forms for '1' in Gur (1 group, 4 languages)


b. **BU** (1/1): only *pú-wò* (possibly *púw-ò*, **PU?**) in Wara (J. Wara-Natioro)


Table 4.157: TA- forms for '1' in Gur



e. **DI** (3/15) (Table 4.158).

Table 4.158: DI- forms for '1' in Gur


f. **DU** (3/13) (Table 4.159)

Table 4.159: DU- forms for '1' in Gur


g. **CU** (1/2): only *mà-cɔ̃́* in Nateni (Central: 1. Northern: C. Oti-Volta: iii. Gurma)


h. **JI** (1/19) (Table 4.160)

Table 4.160: CI- forms for '1' in Gur



Table 4.161: KI- forms for '1' in Gur



l. **KA** (1/2) (Table 4.162)

Table 4.162: KA- forms for '1' in Gur


m. **KU** (2/3) (Table 4.163)

Table 4.163: KU- forms for '1' in Gur


n. **GI** (1/5) (Table 4.164)

Table 4.164: GI- forms for '1' in Gur



The only lacuna in this presentation is due to the lack of forms with voiceless labial consonants (this, however, may not prove true in the case of Wara-Natioro, as we hope to demonstrate below). It should be noted that the general distribution pattern is that a single form is attested in one branch out of ten, three forms are found in both two and three branches, and none of the forms is recorded in four or more branches. This makes an attempt at tracing them back to a single source form (with its further comparison to the evidence of the other families) unreasonable. In view of the genetic classification of the Gur languages and the considerations presented above, the optimum solution to the problem probably lies within separate reconstructions of numerals in the following sixteen Gur branches that belong to ten major language groups of this family, assuming that each of them may shed some new light on the reconstruction of the Niger-Congo numeral system:


Numerical terms as attested in each of these branches will be examined below.

### **4.9.2 Bariba**

Table 4.165: Bariba numerals

### **4.9.3 Central Gur**

#### **4.9.3.1 Northern Central Gur**

4.9.3.1.1 Bwamu

Table 4.166: Bwamu numerals


#### 4.9.3.1.2 Kurumfe

4.9.3.1.3 Oti-Volta

### **i. Buli-Koma (Table 4.168)**

#### **ii. Eastern (Table 4.169)**

Note the extreme divergence of the languages within this branch: the variety of forms presented in Table 4.169 is attested in only four languages, i.e. Biali, Ditammari, Mbelime and Waama.


Table 4.167: Kurumfe numerals

Table 4.168: Buli-Koma numerals


Table 4.169: Eastern Oti-Volta numerals



#### **iii. Gurma (Table 4.170)**

Table 4.170: Gurma numerals


#### **iv. Western (Table 4.171)**

Table 4.171: Western Oti-Volta numerals


#### **v. Yom-Nawdm (Table 4.172)**

Table 4.172: Yom-Nawdm numerals



**Proto-Oti-Volta** The evidence of five Oti-Volta branches (isolated forms excluded) is summarized in Table 4.173.


Table 4.173: Numerals in Proto-Oti-Volta

The reconstruction of the Oti-Volta numeral system is surprisingly unproblematic. In addition to the expectedly persistent reflexes of 'three' and 'four', homogeneous forms for 'two', 'five', and 'ten' are noteworthy. The term for 'eight' seems to be based on 'four' (either via partial reduplication or according to the '4PL' pattern). In addition, Oti-Volta is characterized by the presence of the primary (homogeneous) forms of 'six', 'eight', and 'nine'. The forms of 'seven' are probably derived and follow the pattern '6+1'. It appears that the derivative form *\*lob-le* > *lole* is already reconstructable at the Proto-Oti-Volta level.


#### **4.9.3.2 Southern Central Gur**

4.9.3.2.1 Dogoso-Khe


Table 4.174: Dogoso-Khe numerals

The forms pertaining to these languages that are not present in the main databases are quoted according to Kerstin Winkelmann (Winkelmann 2007d: 181–210). Although the numerals attested within the two languages of this group are quite persistent, Winkelmann stresses their grammatical difference: "… while Dɔgɔ-sʊ uses noun suffixes, sʊ-Khe is a prefixing language" (Winkelmann 2007d: 209).

4.9.3.2.2 Gan-Dogose



Three of the languages belonging to this branch show too many forms, suggesting that we are dealing with a heterogeneous branch. In view of its numerical terms, it is not immediately apparent why this branch has been singled out.


#### 4.9.3.2.3 Grusi

### **i. \*Eastern Grusi (Table 4.176)**

Table 4.176: Eastern Grusi numerals (\*)


#### **ii. \*Northern Grusi (Table 4.177)**

Table 4.177: Northern Grusi numerals (\*)


#### **iii. \*Western Grusi (Table 4.178)**

Table 4.178: Western Grusi numerals (\*)



The most probable \*Proto-Grusi reconstructions based on the roots attested in at least two Grusi branches are summarized in the table below (Table 4.179).


Table 4.179: Proto-Grusi numeral system (\*)
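The selection criterion stated above (a root is retained if it is attested in at least two Grusi branches) can be sketched as a simple filter. The fragment below is a hedged illustration only: the function name `roots_in_min_branches`, the data layout, and the placeholder roots (`root_a` etc.) are assumptions introduced here and do not reproduce the forms of Tables 4.176–4.179.

```python
from collections import defaultdict


def roots_in_min_branches(attestations, min_branches=2):
    """attestations: iterable of (numeral, branch, root) triples.
    Returns {numeral: sorted list of roots attested in >= min_branches branches}."""
    branches = defaultdict(set)
    for numeral, branch, root in attestations:
        branches[(numeral, root)].add(branch)

    result = defaultdict(list)
    for (numeral, root), seen in branches.items():
        if len(seen) >= min_branches:
            result[numeral].append(root)
    return {numeral: sorted(roots) for numeral, roots in result.items()}


# Hypothetical placeholder data: three branches, two numerals.
data = [
    (1, "Eastern", "root_a"),
    (1, "Northern", "root_a"),
    (1, "Western", "root_b"),   # isolated: one branch only
    (2, "Eastern", "root_c"),
    (2, "Western", "root_c"),
    (2, "Northern", "root_d"),  # isolated: one branch only
]

print(roots_in_min_branches(data))
```

A filter of this kind only identifies candidates; deciding whether the surviving forms are true cognates remains a matter of comparative analysis.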

#### 4.9.3.2.4 Kirma-Tyurama

Table 4.180: Kirma-Tyurama numerals


### **4.9.4 Kulango**

The source form of the term for 'one' with a nasalized vowel is reconstructed on the basis of the evidence presented by Stefan Elders (2007: 323). As we have seen, the Gur term for 'five' is reconstructed as \**nu* on the basis of the evidence provided by the groups discussed above. It should be noted that this form goes back to the lexical root meaning 'hand' (Kulango *nu-gò*). The term for 'ten' in Kulango is a reduplicated \**nu*, whereas a different root is attested for 'five'. It is also noteworthy that the terms for 'two', 'three', 'hundred' and 'thousand' are borrowed from Mande.



Table 4.181: Kulango numeral system

### **4.9.5 Lobi-Dyan**

According to Anthony Naden's classification (Naden 1989), these languages belong to different groups of the Gur languages, so their evidence will be presented separately.

"More recent classifications (Labouret and Manessy) regarded Lobi (Lobiri) and Ja ¯ a ¯ ne as closely related" (Miehe & Tham 2007: 212) (Table 4.182).


Table 4.182: Lobi-Dyan numerals


### **4.9.6 Senufo**

Table 4.183: Senufo numerals


Many of the forms are quoted in brackets, i.e. they are isolated forms attested within the Senufo group comprising about fifteen idioms. As in a number of other Gur branches, the last syllable/segment of a numerical term often represents a coordinating noun class suffix. Below is an excerpt from the table showing the inflection of numerals by class in Tenyer (Syer variety), as published by Klaudia Dombrowsky-Hahn in Winkelmann (2007a: 420) (Table 4.184).

Table 4.184: Tenyer numerals (a fragment)


This presentation illustrates how problematic defining the numerical roots can be.

### **4.9.7 Teen**


Table 4.185: Teen numerals

### **4.9.8 Tiefo**

Table 4.186: Tiefo numerals


### **4.9.9 Tusia**



### **4.9.10 Viemo**

Table 4.188: Viemo numerals


### **4.9.11 Wara-Natioro**

It should be noted that the most important evidence pertaining to this group is relatively recent. In his publication of the comparative lexical list Tasséré Sawadogo noted that Faniagara is radically different from both Wara and Natioro (Sawadogo 2002). Its similarity index with the Natioro and Wara dialects is 12 and 30 percent respectively (the SIL list? idem., p. 15). Thus he had every reason to postulate the existence of an isolated language (Palɛn) in the Wara-Natioro group.

Since the data collected by Tasséré Sawadogo is absent from the major databases that are now incorporated into the RefLex database by Guillaume Segerer, it seems reasonable to present it below for each Wara-Natioro-Paleni idiom in order to suggest the reconstruction of numerical terms within each of the three sub-groups and within the group as a whole (Table 4.189).

According to other sources, the forms *wã́/ nwõ, sɔ* are attested in Wara-Natioro for 'twenty'. The patterns '20\*5' and '400\*2+200' are attested for 'hundred' and 'thousand' respectively.
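Arithmetic patterns of this kind, which recur throughout the chapter (e.g. '20=10\*2', '9=10–1', '1000=800+200'), are easy to check mechanically. The following Python sketch is an assumption-laden illustration, not part of the reconstruction method: the function names `pattern_value` and `is_consistent` are hypothetical, and only patterns built from integers with `+`, `-` (or the en dash) and `*` are supported; descriptive patterns such as '8=4 redupl.' are outside its scope.

```python
import ast
import operator

# Operators allowed in a numeral pattern.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _eval(node):
    """Evaluate a tiny arithmetic syntax tree (integers, +, -, * only)."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("unsupported pattern element")


def pattern_value(pattern):
    """Return the value of the right-hand side of a pattern such as
    '8=5+3', or of a bare expression such as '20*5'."""
    rhs = pattern.split("=")[-1].replace("–", "-")  # normalise the en dash
    return _eval(ast.parse(rhs, mode="eval").body)


def is_consistent(pattern):
    """Check that 'N=expr' really evaluates to N."""
    target = pattern.partition("=")[0]
    return int(target) == pattern_value(pattern)


# The two Wara-Natioro patterns quoted in the text.
print(pattern_value("20*5"))        # value of the pattern for 'hundred'
print(pattern_value("400*2+200"))   # value of the pattern for 'thousand'
```

Here '20\*5' evaluates to 100 and '400\*2+200' to 1000, which confirms that both patterns are vigesimal in structure (built on 20 and on 400 = 20\*20).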

<sup>23</sup>Regarding the Natioro forms for 'one' André Prost remarks: '*puwolo* (après un substantif: *kaaba)'* (Prost 1968: 78). Thus, the opposition between the Wara and Natioro forms of 'one' reflected in the table may be purely functional (for Wara Prost quotes the *puwo* and *kapo* forms).



Table 4.189: Wara-Natioro-Paleni numerals


### **4.9.12 Proto-Gur**

#### **4.9.12.1 'One'**

The main forms of 'one' reconstructable in sixteen branches of Gur are as follows (Table 4.190).


Table 4.190: Stems for '1' in Gur

An attempt to reconstruct a Proto-Gur form is probably not reasonable at this point, since all the forms quoted above are important for comparative purposes.

### **4.9.12.2 'Two'**


Table 4.191: Stems for '2' in Gur

Apparent isolates and obvious borrowings are presented in the rightmost column.

### **4.9.12.3 'Three' and 'Four'**


Table 4.192: Stems for '3' and '4' in Gur

The reflexes of the most persistent NC roots are observable in the majority of the branches.

### **4.9.12.4 'Five'**


Table 4.193: Stems for '5' in Gur

The etymological relationship between \**nu* '5' and 'hand' is attested in Central Gur and possibly in Bariba and Senufo. Isolated bases may go back to this meaning as well. At the same time, the base preserved in Kulango, Teen and possibly Wara-Natioro-Paleni is comparable to \**tan* found in BC and some other families.

#### **4.9.12.5 'Six' and 'Seven'**


Table 4.194: Stems and patterns for '6' and '7' in Gur

The patterns \*'6=5+1' and \*'7=5+2' can be safely reconstructed at the Proto-Gur level. The exceptionally wide range of forms for 'six' attested in Senufo is noteworthy.

### **4.9.12.6 'Eight' and 'Nine'**


Table 4.195: Stems and patterns for '8' and '9' in Gur

In addition to the common patterns '8=5+3' and '9=5+4', alternative ones are attested for 'eight' and 'nine' ('8=4 redupl.' and '9=10–1' respectively).

### **4.9.12.7 'Ten'**


Table 4.196: Stems for '10' in Gur

This term exhibits a variety of isolated (and possibly non-primary) forms. The main form has a voiceless labial as its initial consonant.

### **4.9.12.8 'Twenty'**


Table 4.197: Stems and patterns for '20' in Gur

In view of the great variety of forms and patterns attested for this term, the existence of the term for 'twenty' in Proto-Gur is uncertain.



### **4.9.12.9 'Hundred'**

Table 4.198: Stems and patterns for '100' in Gur

#### **4.9.12.10 'Thousand'**

No evidence supports the reconstruction of the term for 'thousand' in this family.


Table 4.199: Stems and patterns for '1000' in Gur


### **4.10 Mande**

The intermediate step-by-step reconstructions available for the Mande languages in Vydrin's Mande Etymological Dictionary and in Vydrin 2007<sup>24</sup> have made treatment of the data easier.

The genetic classification of Mande, outlined in the latter work, will serve as the basis for our analysis. This classification differs from the one suggested by Kastenholz and is accessible via *Ethnologue* (Simons & Fennig 2018). According to V. Vydrin,

Its major innovations, in comparison with that of Kastenholz, are the following:


Let us note an important fact: the numeral system of Jowulu differs considerably in certain points both from other Samogho languages and from Mande languages in general. It is interesting to note that in R. Kastenholz's classification (based on the method of shared innovations, rather than on lexicostatistics) Jowulu is given a special status, more precisely, the first split in his Northwestern Mande branch (Bozo-Soninke + Bobo + Samogo + Jowulu).

Our further analysis will be based on the evidence from twelve branches of Mande represented in Figure 4.1.

<sup>24</sup>I would like to thank V. Vydrin for his suggestions and comments on the preliminary draft of this chapter.



### **4.10.1 'One'**



Vydrin's preliminary reconstructions, as well as isolated forms resulting from the analysis of the numerical terms, are marked with an asterisk [\*].

The isoglosses for 'one' suggest the existence of two alternative roots (*\*dọ́* and *\*kelen*) attested in both major Mande groups. The latter root is distinguishable under the assumption that the forms with a voiced velar attested in the Eastern branch of the South-Eastern group (Matya Samo *ɡɔ̀rɔ́*, Southern Samo (Maka) *ɡôon*) are related to the **k**-forms found in Western Mande.

The next two roots, if related, may be suggestive with regard to the classification of Western Mande (otherwise, they probably represent similar unrelated forms). It should be noted that the root *ǹdá* (Susu *nde* 'one, certain', *ndende* 'anybody, whoever; nobody', Jalonke *ǹdá* 'certain') attested, according to Vydrin, in Susu-Jalonke may be related to *\*dọ*. The determiner *\*dọ́*, which can be reconstructed at the Proto-Mande level, goes back to the root *\*do*.

The rightmost column of the table embraces the isolated forms.

### **4.10.2 'Two'**


Table 4.201: Mande stems for '2'

A common root for 'two' that may be tentatively recorded as *\*pila / fila* is attested in all Mande branches. Its precise phonetic reconstruction is beyond the scope of our investigation. The reader can refer to the works of specialists in the historical phonetics of Mande. A reference designation that will enable us to compare this root to the evidence of the other NC families is sufficient for our reconstruction purposes.

### **4.10.3 'Three'**

Table 4.202: Mande stems for '3'

The common root *\*sakpa/ sagba/ sawa* is represented in all Western branches. The relationship between some of the forms attested in the Eastern group (Southern Samo (Maka) *sɔɔ̄*, Matya Samo *̄ tjɔwɔ*) remains uncertain. The Jowulu form is especially peculiar. It should be noted that the forms of some numerical terms differ significantly depending on the source. Our study is based on four Jowulu sources that provide the following evidence<sup>25</sup> (Table 4.203).

The terms for 'seven', 'eight' and 'nine' follow the pattern '3,2,1+'to lose'' respectively (cf. their inaccurate interpretation in Hochstetler, see §4.10.9), hence the reconstruction of the term for 'three' with the initial palatal (\**jɔ̀n*). The forms quoted in Jowulu for 'three', 'four', and 'ten' are uncommon. If we were dealing with a language with a noun class system, we would have to conclude that a noun class marker (cl19?) with two allomorphs (**p-** and **b-** before voiceless and voiced consonants respectively) is traceable in the pertinent forms. However, we are dealing with a language that undoubtedly belongs to Mande, so no class-related morphemes can be involved. This leaves the presence of the initial labial in the term for 'three' unexplained. A borrowing from Gur or Kru cannot be assumed since these languages lack comparable forms. The only plausible solution is the alignment of 'three' and 'four' by analogy with 'ten', where it must have been originally present.

<sup>25</sup>Hochstetler (1996); Djilla et al. (2004); Carlson (1993); Prost (1958).

A special term for 'three' appears in South-Eastern. In Eastern it can be reconstructed as *\*ʔààkɔ̃* or possibly *\*\*ʔàà-*(*kɔ̃*), cf. Bisa *kakʊ́*, Boko *ʔààɔ̃* (in Koelle 1963[1854] *ááɣo* ¯ ), Bokobaru (Zogbẽ) *ʔààɡɔ̃*, Busa *ʔààkɔ̃*, Maya Samo *kàakú*, Kyanga *ˀāàː*, and Shanga *ʔà*. The latter reconstruction is supported by the fact that the terms for 'three' and 'four' share the ultima, cf. the data presented in Table 4.204.

Table 4.204: Final morphemes in the Boko-Busa numerals


It should be noted that in these languages, the syllable in question is also present in the terms for 'eight' that are built according to the pattern '5+3' (cf. e.g. Bobo Karu *sɔ́r-ààɡɔ̃*). Here we may be dealing with alignment by analogy, possibly with an additional final morpheme of uncertain meaning. It should be stressed that the ultima in 'three' and 'four' is never the same in the Eastern subgroup of the South-Eastern languages, whereas the medial velar is only attested in 'three' but not in 'four'. Assuming that the forms of the two Eastern branches are related, the term for 'three' can be reconstructed as *\*ʔààkɔ̃/yààká*, whereas the term for 'four' may be interpreted as resulting from the alignment by analogy with the forms of 'three' attested in the Eastern branch of South-Eastern Mande. The evidence in favor of its etymological connection with *\*sakpa* is inconclusive.

### **4.10.4 'Four'**

Table 4.205: Mande stems for '4'

An easily recognizable NC form (*\*náání/ nɑ̃ɑ̃i*) can be reconstructed in Western Mande, whereas in South-Eastern Mande it is replaced with an innovation (*\*sĩ̀ĩ̀yã́*). This innovation may also be attested in Jowulu.

### **4.10.5 'Five'**

Table 4.206: Mande stems for '5'


There is a correspondence between **d-/ l-/ s-** within Western Mande, hence the Eastern forms with the initial **s-** should not necessarily be treated separately. A discussion of the exact phonetic reconstruction is better left to specialists in the field. For our purposes, it is sufficient to record that the Proto-Mande root for 'five' is reconstructed as *dúuru/ sɔ́ɔ́ru*.

However, the root(s) *\*wo, \*ko* are traceable in the compound numerical terms attested in Western Mande. They may be etymologically related to the lexical root meaning 'hand' (Vydrin, p.c.; cf. Proto-South-Mande *\*kɔ̀* 'hand'). The latter may be a NC root, cf. e.g. the term for 'hand' in Proto-Gbaya (*kɔ́* ˜ ), Dida (Kru) (*kɔ̄*) and in other languages.

The Jowulu and Samogo forms are peculiar. As we hope to demonstrate in the next chapter, two alternative roots for 'five' can be reconstructed for NC, namely *\*tan/ ton* and *\*nu(n)*. Both roots are directly attested in these marginal groups. Is this enough to reconstruct the terms for 'five' traceable in NC for the Mande languages? We will return to this question in the last chapter of the book.

### **4.10.6 'Six'**


Table 4.207: Mande stems and patterns for '6'

The reconstruction of the Mande term for 'six' is problematic. The root *t(s)um* is worth considering, since it is attested in both Bozo-Soninke and Samogo (the root found in Susu is probably isolated). Its reconstruction at the Proto-Mande level is, however, unlikely. The common pattern '6=5+1' is attested in both major branches. The root *wɔrɔ* is non-primary and ultimately goes back to the aforementioned pattern (or, to be precise, to the pattern '6'='hand'+1). This hypothesis is supported by the forms of 'seven' as well.


### **4.10.7 'Seven'**


Table 4.208: Mande stems and patterns for '7'

A few remarks are in order before we turn to the discussion of the term for 'seven'. In the majority of the Mande branches, the term represents a compound. Its second element goes back to the term for 'two', cf. e.g. Jula *wólonfìlà* '7', *fìlà* '2'.

The relationship between the terms for 'six' and 'seven' is based on alignment by analogy. This bond sometimes results in unification of the terms, so that sources may explain 'seven' as '6+1' (despite the fact that 'two', not 'one', is manifestly present in 'seven'). This interpretation has become recurrent for the Mokole languages. According to Phillip Logan,<sup>26</sup> the Kuranko evidence is as follows: *wɔrɔnfila* (**'6+1'**) (⁈ –*K.P*.), *wɔrɔ* '6', *fila* '2', *kelen* '1'. The same idea is applied to Lele (cf. Marc Gebhard:<sup>27</sup> *wɔrɔŋ kela* ('6+1'),<sup>28</sup> *wɔɔrɔ* '6', *fela* '2', *kelɛŋ* '1') and Kakabe (cf. Daria Mishchenko:<sup>29</sup> *wɔ́rɔwila* (**'6+1'**), *wɔ́ɔrɔ* '6', *fìla* '2', *kélen* '1'). Other scholars are more reserved, stating that 'Kono has a decimal system with special construction for 7'.<sup>30</sup> It is, however, quite evident that the forms in

<sup>26</sup>https://mpi-lingweb.shh.mpg.de/numeral/Kuranko.htm

<sup>27</sup>https://mpi-lingweb.shh.mpg.de/numeral/Lele-Mande.htm

<sup>28</sup>According to Vydrine (2009), the Lele term for 'seven' is *wɔ́rɔncɛla* (or *wɔyɛnkela* in the Southern dialect).

<sup>29</sup>https://mpi-lingweb.shh.mpg.de/numeral/Kakabe.htm

<sup>30</sup>Raimund Kastenholz, https://mpi-lingweb.shh.mpg.de/numeral/Kono.htm

#### 4.10 Mande

question follow the pattern '5+2' (or at least 'X+2' with X being an unidentified component).
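The segmentation argument can be illustrated with a naive string check. This is only a minimal sketch: it assumes that plain suffix matching is a fair proxy for the morphological analysis, and it ignores tone and sandhi, so it would not handle forms such as Kakabe *wɔ́rɔwila*. The function name `final_element` and the dictionary layout are hypothetical; the forms are the Jula and Kuranko ones quoted above.

```python
import unicodedata

# Forms quoted in the text above; the dictionary layout is an illustrative assumption.
FORMS = {
    "Jula": {"seven": "wólonfìlà", "two": "fìlà"},
    "Kuranko": {"seven": "wɔrɔnfila", "one": "kelen", "two": "fila"},
}


def nfc(s):
    """Normalize to NFC so that precomposed and combining diacritics compare equal."""
    return unicodedata.normalize("NFC", s)


def final_element(language):
    """Return the gloss ('one' or 'two') whose form closes the term for 'seven', or None."""
    forms = FORMS[language]
    seven = nfc(forms["seven"])
    for gloss in ("one", "two"):
        form = forms.get(gloss)
        if form and seven.endswith(nfc(form)):
            return gloss
    return None


for lang in FORMS:
    print(lang, final_element(lang))
```

Under these assumptions both languages return 'two', which is the point made in the text: whatever the first component is, the second one is not 'one'.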

It is not a mere coincidence that the interpretation outlined above is recurrent in the Mokole languages, where the forms of 'six' and 'seven' have become partially unified. In a number of languages from other groups that have etymologically related terms for 'six' and 'seven', these terms differ in their second consonant, cf. Bamana (Manding): *wólonwula* '7', *wɔ́ɔrɔ* '6'.

In both groups of South-Eastern Mande the patterns '5+1' and '5+2' for 'six' and 'seven' respectively are still clearly recognizable (Table 4.209).

Table 4.209: Stems for '6' and '7' in South-Eastern Mande


Taking all of this into consideration, the most likely evolution scenario for 'six' and 'seven' is as follows:


### **4.10.8 'Eight'**


Table 4.210: Mande stems and patterns for '8'

The pattern '8=4\*2'/'4PL' commonly found in the majority of the families discussed above is barely attested in Mande. Meanwhile, the phonetic similarity between *naai* '4' ~ *ŋaai(n)* '8' (attested in the majority of the Samogo dialects) is hardly an accident.

The etymology of *kàà* (not found outside Seenku) is unknown.

The pattern '5+3' is inconclusive, because it often develops independently in various languages. The interpretation of the main Mande root (tentatively described as *seki/ segi*) is uncertain. On the one hand, its current forms suggest that this root can be reconstructed not only for Proto-Western Mande, but for Proto-Mande as well (cf. South-Eastern forms, in particular *sa̋ȁgā* '8'). On the other hand, such a reconstruction is hindered by at least two issues.

Firstly, the second velar in the South-Eastern Mande forms does not belong to the root. It is part of a reduced segment that goes back to the term for 'three' (cf. Tura *yȁká* '3'), whereas the first segment goes back to the term for 'five' (cf. Tura *sőlű*, *sőőlű*, *sʋ̋lʋ̋*). The comparative analysis of the forms of 'eight' attested in the South-Eastern Mande languages (not quoted here in detail) strongly suggests that the South-Eastern Mande pattern for 'eight' is '5+3'.

Secondly, this reconstruction is problematic from a typological point of view. As has been demonstrated above, our evidence prevents us from reconstructing primary roots for 'six' and 'seven'. In terms of typology, a primary root for 'eight' would look highly unusual in this context. Such a root could be expected in those few numeral systems where 'eight' is a basic numeral (just like 'twelve' is a basic numeral in some of the Benue-Congo numeral systems described above, hence '100=12\*8+4'). However, 'eight' has never been a basic unit of counting in Mande systems. The existence of a primary term for 'forty' (assuming that 'forty' is '8\*5') in some of the Mande languages could be interpreted as a hint at a special status of 'eight'. However, this is not supported by any real evidence.
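The arithmetic behind the two base patterns mentioned here can be spelled out in a few lines. The sketch below is purely illustrative; the helper name `decompose` is hypothetical, and it simply restates '100=12\*8+4' and '40=8\*5' as integer division.

```python
def decompose(n, base):
    """Split n into (multiplier, remainder) such that n == base * multiplier + remainder."""
    return divmod(n, base)


# A system with 'twelve' as a basic numeral: '100=12*8+4'
multiplier, remainder = decompose(100, 12)
print(f"100 = 12*{multiplier}+{remainder}")  # → 100 = 12*8+4

# A hypothetical special status of 'eight': 'forty' as '8*5' leaves no remainder
print(decompose(40, 8))  # → (5, 0)
```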

This raises a question about the etymology of the Western Mande term for 'eight' (*seki/ segi*). Its resemblance to the term for 'three' (especially in Bozo and Soninke, cf. Jenaama Bozo *síkɛ̃ ̀ũ* '3' ~ *sèkːí* '8') may be suggestive here. Is there enough evidence to reject the hypothesis that 'eight' in Proto-Western Mande was built according to the pattern '8=plus 3' (this would assume a counting reference to 'five')?

Despite the doubts expressed above, these forms are worth comparing to other forms of 'eight' attested in other NC families.

### **4.10.9 'Nine'**


Table 4.211: Mande stems and patterns for '9'


Two competing patterns are distinguishable here ('9=5+4' and '9=10–1'). In some of the branches (e.g. SWM, Vai-Kono) they are attested side by side.

At the same time, these patterns cannot be postulated for some of the languages without additional support. The pattern '9=10–1' seems to be apparent in South-Eastern Mande and some of the SWM languages only, cf. Boko '9': *kɛ̃ ̀okwi* (lit.: 'tear away 1 (from) 10'), *kwi* '10'; in Busa '9': *kɛ̃ndo/kĩ ́ ń* ˙ *dokwi* (lit.: 'tear away 1 (from) 10'), *kwi* '10', *do* '1'; in Bandi (SWM) *taá-vu* '9', *ìtá(ŋ)* '1', *púu* '10'. According to Robert Carlson (1993: 30), the terms from 'seven' to 'nine' in Jowulu follow the pattern '1–3' + 'lose' (*fɔ́nì*), i.e. *jɔ̃ɔ̃-pɔ́nì* '7', *fúl-pɔ́nì* '8', and *tẽ̀ẽ̀-pɔ́nì* '9' (note that these terms are misinterpreted as 3+4, 2\*4, 5+4<sup>31</sup> by Lee Hochstetler).
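The arithmetic behind the two competing patterns, and behind the two readings of the Jowulu terms, can be checked with a short sketch. It assumes that the pattern '1–3' + 'lose' counts down from ten, so that 'seven' is '3 lost', 'eight' is '2 lost' and 'nine' is '1 lost'. This reading, like the variable names, is an illustrative assumption and is not stated in this form in the sources.

```python
# '9' according to the two patterns named above
additive = 5 + 4
subtractive = 10 - 1

# Jowulu '7'-'9', subtractive reading (assumed: n units 'lost' from ten)
subtractive_reading = {7: 10 - 3, 8: 10 - 2, 9: 10 - 1}

# The reading criticized in the text: 3+4, 2*4, 5+4
alternative_reading = {7: 3 + 4, 8: 2 * 4, 9: 5 + 4}

# Both readings yield the correct values, so arithmetic alone cannot decide
# between them; only the morphological segmentation can.
print(additive == subtractive)
print(subtractive_reading == alternative_reading)
```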

The root *kònonto/kɔ̀nɔndɔ(n)* attested in Manding and Mokole is unclear and deserves discussion by specialists. By contrast, the forms interpreted as the combination '5+4' in the table below seem to be quite transparent (Table 4.212).


Table 4.212: '9 = 5+4' in Mande

This section, however, is not unproblematic. The Jogo-Jeri non-primary terms for '6–9' are formed by two components. The second (i.e. the terms for 'one', 'two', 'three' and 'four' respectively) is easily recognizable, whereas the etymology of the first (**ma**-) is unclear.

### **4.10.10 'Ten'**

This term is especially interesting in light of the fact that the distribution of the isoglosses of 'ten' served as the basis for Maurice Delafosse's early classification of the Mande languages including the *Mande-tan* and *Mande-fu* groups. These two roots are indeed the main Mande roots with this meaning. However, their distribution does not correspond to the two major branches of Mande as they are distinguished today. The root \**tan* is indeed found in all groups of the Western

<sup>31</sup>https://mpi-lingweb.shh.mpg.de/numeral/Jowulu.htm


Table 4.213: Mande stems for '10'

branch except for Bobo and SWM. However, the attestations of the root \**pu*/*fu* are not limited to South-Eastern and extend to a number of the Western branches such as Bobo, SWM, Susu (and possibly Manding-Mokole, assuming that its reflex denotes tens in compound numerals). Isolated forms attested in South-Eastern and in peripheral Western languages are noteworthy.

The reconstruction of \**pu*/*fu* for Proto-Mande and the interpretation of \**tan* as the Proto-Western Mande innovation seem well-founded.

The etymology of \**tan* is obscure. Its similarity to the locally attested root \**tan* (cf. Soninke *tàán* 'foot, leg'; 'wheel'; 'time' (when counting), Bozo Tieyaxo *tɔn* 'foot, leg'; 'time' (when counting), Bozo Hainyaxo *tǎ*, Bozo Tiemacewe *tawa*, Bozo Sorogama *taba*) is likely a coincidence. Lexical roots with the meaning 'foot' are attested in NC numeral systems, usually as a basis for the non-compound terms for 'fifteen'. The logic behind this development is simple: 'ten' is 'two hands', 'twenty' means 'man', i.e. 'two hands and two feet', hence 'fifteen' is 'foot'. This seems to be the case for Boko and Busa, where a non-compound term for 'fifteen' (*ɡɛ̃ ̀o/ ɡɛ̃ ̀ro*) is attested (hence '16=15+1' in these languages). This root is etymologically related to 'foot, leg' in Duungoma (Samogo) *gẽ*, Dan *gɛ̂* ˜ , Mano *gà* ˜ (it should be noted that within Mande a non-compound root for 'fifteen' is also attested in Ligbi, cf. *tíɡán / tiɡa* '15', *tíɡá-ló* '16').
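The body-part logic described above amounts to simple addition. The following sketch makes the steps explicit; the constant names and the value of five digits per limb are illustrative assumptions, not part of any source.

```python
HAND = 5  # assumed: five fingers per hand
FOOT = 5  # assumed: five toes per foot

ten = 2 * HAND                # 'two hands'
twenty = 2 * HAND + 2 * FOOT  # 'man', i.e. 'two hands and two feet'
fifteen = ten + FOOT          # 'foot': both hands plus one foot
sixteen = fifteen + 1         # '16=15+1', as reported for Boko and Busa

print(ten, fifteen, sixteen, twenty)  # → 10 15 16 20
```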

In addition, a similarity to the term for 'one' as attested in some of the languages must be a coincidence.


A hypothesis assuming a semantic shift NC *\*tan* '5' > Proto-Western-Mande *tan* '10' in parallel with the development of the Mande innovation \**dúuru/ sɔ́ɔ́ru* 'five' seems to be a better explanation.

It is worth recalling that the Bokobaru root *kuri* 'ten' has a direct parallel in the isolated Bangime language (*kúrɛ́*; cf. also Boko *kúúli* recorded by Koelle).

### **4.10.11 'Twenty'**


Table 4.214: Mande stems and patterns for '20'

There is every reason to believe that the term for 'twenty' was based on the lexical root(s) meaning 'human person' at the Proto-Mande level. The etymology of some of the isolated forms presented in the table should be sought with this in mind.

### **4.10.12 'Hundred'**

The root *kɛmɛ*, widely attested throughout Western Africa, is noteworthy. Its original semantics deserve a separate study: it is well known that in some languages this root can be used for 'sixty' or 'eighty' and not for 'hundred' (the archaic Bamana counting system: *màninkɛ̀mɛ* '60', *bámanankɛ̀mɛ / kɛ̀mɛ* '80', *kɛ̀mɛ ní mùgan* '100' (80+20)) (Vydrin & Perekhvalskaya 2015: 360).

<sup>32</sup>Mende *núú ɡ͡bɔyɔ́nɡo* '20' ('person finished'). https://mpi-lingweb.shh.mpg.de/numeral/Mende.htm



Table 4.215: Mande stems and patterns for '100'

### **4.10.13 'Thousand'**

The roots for 'thousand' attested in the Mande languages were borrowed by West African languages. The original meaning of the Mande root *wáa/ wága* may be 'a basket of cola nuts' (Vydrin & Perekhvalskaya 2015: 361), cf. Bamana *wágá* 'basket for cola nuts', Bobo *wágá* 'basket used for carrying cola nuts, or *wòlōwágá*'.

Table 4.217 gives an overview of the Mande forms and patterns that will be used for further comparison with the evidence of other families.


Table 4.216: Mande stems and patterns for '1000'

Table 4.217: Numerals in Proto-Mande



### **4.11 Mel**

A narrow definition of the Mel family is preferred here (in accordance with the classification of the Atlantic languages suggested in Pozdniakov & Segerer 2017). This family comprises two compact language groups, namely Northern (Temne, Landuma, and all Baga languages except for Baga Fore and Baga Mboteni, namely Baga Koba, Baga Maduri, Baga Sitemu and others) and Southern (Kisi, Sherbro, Mani, and Krim). Sua, Limba and Gola are not included within the Mel family and are viewed as isolated NC languages. Since the two Mel groups consist of distantly related languages, their numeral systems are treated separately below.

### **4.11.1 Southern Mel**


Table 4.218: South Mel numerals


Noun class markers are usually positioned as suffixes in Kisi. However, the first numerical terms in this language have noun class prefixes, which makes the forms look inconsistent, cf. *mùúŋ/ mìɔ́ɔ́ŋ / ŋìɔ́ɔ́ŋ / dìíŋ, tìɔ́ɔ́ŋ/là-tìɔ́ɔ́ŋ* 'two'.

The terms for 'hundred' and 'thousand' were probably absent in Proto-South-Mel. The similarity between Kisi *tɔ́* 'ten' and Bullom-Mani *tɔ̀ŋ* 'twenty' is noteworthy. 'Twenty' may follow the pattern '20=10pl'. If so, the original *tɔ̀ŋ* 'ten' should be viewed as an early borrowing from Western Mande (*\*tan* '10'). In this case, *\*wan* '10' is an innovation (probably based on *\*wan/wen* 'five') that developed in South Mel after Kisi had separated. The numeral system of modern Kisi exhibits no significant changes from the forms described by Koelle. It includes the form *ŋam-puum* '6' (Tucker Childs: *ŋɔ̌ŋpúm*) that may have retained an archaic allomorph of 'one' (*\*pum*). The forms that will be used for further comparison are summed up in the table below (Table 4.219).

Table 4.219: Proto-South Mel numeral system (\*)


### **4.11.2 Northern Mel**

A higher degree of homogeneity observable in these languages allows an instant reconstruction of their numeral system at the Proto-Northern Mel level (Table 4.220).

Table 4.220: Proto-Northern Mel numeral system (\*)



### **4.11.3 Proto-Mel**

The table below gives an overview of South Mel and North Mel forms (Table 4.221).


Table 4.221: Proto-Mel numeral system (\*)

### **4.12 Atlantic**

Our step-by-step reconstruction of numeral systems in the Atlantic languages will be based on their classification suggested in Pozdniakov & Segerer 2017 (forthcoming) that distinguishes two main groups within the Atlantic family, namely Northern and Bak.

### **4.12.1 Northern**

The numeral systems of Northern Atlantic are treated below by sub-group.

#### **4.12.1.1 Cangin**



Some of the reconstructions presented above are not immediately apparent and are in need of additional commentary. A detailed discussion of each of them


would be impossible here, so we will take the reconstruction suggested for 'four' (*nik-iɭ*) as a sample.

At first glance, the forms of 'four' attested in the Cangin languages have nothing in common. Two of the five Cangin languages have *kinil* 'four' (Ndut-Palor), whereas in the remaining three (Laala, Noon, and Safin) *nikis* is used in this function. The easiest solution to the problem would be to postulate two alternative forms for this group. However, as the evidence of comparative-historical phonetics suggests, the final **-l** in Ndut-Palor regularly corresponds to the final **-s** in Laala-Noon-Safin (Table 4.223).

Table 4.223: l ~ s regular correspondence in Cangin


This fact alone calls for a closer examination of the forms quoted above. Further analysis shows that a fossilized noun class prefix **kV-** is present in some of the Palor numerals, cf. *ka-nak* 'two', *ke-jek* 'three', *ki-nil* 'four', *kip* 'five'. At the same time, the suffix -**Vs** is observable in the Noon numerals, cf. *jet-us* 'five'. This evidence combined suggests the following development of the forms for 'four' (Table 4.224).

Table 4.224: Development of \**nik-Vɭ* '4' in Cangin


### **4.12.1.2 Nyun-Buy**

Numerical terms are highly divergent within this sub-group, so it seems reasonable to treat them by branch (Table 4.225).



Table 4.225: Nyun-Buy numerals

The pattern '5'='hand' ~ '10'='hands' is immediately apparent in Nyun. In the case of Buy, it can be accepted only under the assumption that the derived term for 'five' became phonetically distant from its source form, cf. Kasanga *ji-rek*, Kobiana *ji-hak* 'hand' (these forms must be related to Nyun *ci-lax* 'hand'). In any case, the Kasanga term *ŋaa-rooɡ* follows the pattern '5PL' that uses the same plural noun class as the one attested in *ŋa-rek* 'hands'.

The form for 'ten' attested in Joola Ejamat (Atlantic Bak), *si-ntaaja*, is important for the diachronic interpretation of the Kobiana form *ntaajã*. The evidence suggests that the latter was probably directly borrowed from Joola<sup>34</sup> (as was -*anɔʔ* 'one').

#### **4.12.1.3 Jaad-Biafada**

The forms of 'one' (*ɲi/ nɛ*) are distinguishable in the compound numerals, cf. Jaad *ŋka-inɛ* '6' ('5+1'), Biafada *mpaaji nyi* '7' ('6+1'), etc. The term for 'five' goes back to the lexical root meaning 'hand' (Biafada *gə-bəda*, Jaad *ko-bəda*).

<sup>33</sup>Guillaume Segerer (p.c.).

<sup>34</sup>According to Guillaume Segerer (p.c.) it is possible that the Ejamat and Kobiana forms both come from Manjak.



Table 4.226: Jaad-Biafada numerals

### **4.12.1.4 Tenda**

The reconstruction of the Proto-Tenda numerals (Pozdniakov 2016) is based on a comparative analysis of five Tenda languages: Basari, Tanda, Bedik, Bapen, Konyagi.

Table 4.227: Tenda numerals (\*)


The etymology of the Konyagi term for 'five' (*mbəɗ*) is based on the Jaad-Biafada evidence (these languages belong to the same sub-group as Tenda).

#### **4.12.1.5 Fula-Sereer**

The numerical terms are highly divergent within this sub-group, so it seems reasonable to treat them by language (Table 4.228).

The fact that the Sereer terms covering the sequence from 'two' to 'five' have the same final segment is noteworthy. This could potentially be interpreted as a special morpheme or as a sub-morpheme that resulted from alignment by analogy. This discussion will be resumed below. Here it can only be stated that the

<sup>35</sup>Reviewing my first version of the book, Guillaume Segerer has advanced a new interesting etymology for Fula: *jow-i* '5' = *jun-ngo* <*jow-ngo* 'hand'. His hypothesis is quite possible.



Table 4.228: Fula-Sereer numerals

morphological analysis of the Sereer term for 'five' (*ɓe-tVk*) suggested in the table below is not immediately apparent and is thus debatable. Within this approach the element **ɓe-** is interpreted as a noun class prefix despite the fact that such a class is lacking in Sereer. Complex issues pertaining to the reconstruction of the term for 'five' will not be treated here. We shall only note that the plural animate class is reconstructable as **ɓe-** (class 2) in Proto-Fula-Sereer.

#### **4.12.1.6 Wolof**




The Wolof term for 'one' exhibits noun class agreement, cf. *k-enn nit* 'one person', *g-enn garab* 'one tree', *f-enn* 'somewhere', *l-enn* 'something', etc. The same applies to the terms covering the sequence from 'two' to 'four', as demonstrated in Pozdniakov 2015: 82. Nothing is known about the original radical of the root (assuming there was one) since it was replaced by a noun class consonant.

Speaking of 'twenty', it should be said that the form *nit(t)* (apparently related to the lexical root *nit* 'person') is widely used alongside the common Wolof pattern '10\*2'.

#### **4.12.1.7 Nalu-Baga Fore-Baga Mboteni**

This sub-group is the most problematic within Northern Atlantic. Admittedly, the evidence pertaining to their classification as Northern is inconclusive. Moreover, the sub-group itself is highly heterogeneous, which affects its numeral systems as well. The pertinent data for each of these languages is provided below (Table 4.230).


Table 4.230: Numerals in Nalu, Baga Fore and Baga Mboteni


#### **4.12.1.8 Proto-Atlantic North**

The prospects for the reconstruction of the Proto-North Atlantic numerals are discussed below.

#### 4.12.1.8.1 'One' (Table 4.231)


Table 4.231: Numerals for '1' in Northern Atlantic

Isolated forms are quoted in the rightmost column. Direct parallels to some other forms are attested in Cangin – Buy (*nɔʔ*) and Konyagi – Baga Mboteni (*mbɔ*). The most common root is *\*di(n)/ li(n)/ ye(n)/ ne(n)* (assuming that these forms are related).

#### 4.12.1.8.2 'Two', 'Three' and 'Four' (Table 4.232)

Table 4.232: Numerals for '2'-'4' in Northern Atlantic



The forms of 'two' in Tenda-Jaad-Biafada can be explained as a shared innovation, since these two branches belong to the same sub-group. The forms quoted in the two leftmost columns could be related, but the pertinent evidence is inconclusive. The roots *\*nak* and *\*di(k)* are reserved for further comparison.

As in the majority of other NC branches, the terms for 'three' and 'four' (tentatively recorded as *\*taʈ* '3' and *\*nak* '4') are fairly consistent in North Atlantic. It thus appears that the terms for 'two' and 'four' are the same (or phonetically similar) across the languages of this branch. Cangin is the only group that does not comply with the complementary distribution, because in the case of Cangin both terms are reconstructed as \**nak*. Interestingly, the form of 'four' bears a suffix, hence it could potentially be explained as a derivative of 'two'. At the same time, the root *nak* 'four' is reminiscent of one of the most persistent NC roots with this meaning.

In Jaad-Biafada we find the root *\*jow/caw* '3'. This is undoubtedly an innovation in the group which is represented by a remarkable isogloss. This is therefore an argument in favour of interpreting this group as part of the northern branch of the Atlantic family: Biafada -*njo / bíí-co/ bií-yo* '3', Jaad *ma-cao/ macaw/ má-cɔu* '3'. It is possible that we are dealing with an ancient borrowing of Proto-Jaad-Biafada from Mande (from *saba* 'three').

In theory, it is possible that the forms attested in the Cangin languages (*ka-hay / \* ʔe-jɛʔ*) also originated from the Mande form (likely weakened to *\*habi / hawi*).

In this case, numerous Northern Atlantic languages exhibit either reflexes of the Proto-NC form *\*tath* or borrowings (including very ancient ones) from the Mande languages.

#### 4.12.1.8.3 'Four'

The root *\*na(h)i-k* can be securely reconstructed for Proto-Northern Atlantic. As has been demonstrated above, the initial **ñ-** of the Wolof term is a reflex of a noun class prefix that replaced the initial radical of the root. The final -t in the Wolof term probably resulted from the alignment by analogy with the term for 'three' that ends in -t, cf. \**ñ-eenk* ? → *ñ-eent* '4' by analogy with *ñ-ett* '3'.

#### 4.12.1.8.4 'Five' (Table 4.233) and the terms from 'six' to 'nine'

The North Atlantic languages are characterized by the term for 'five' being systematically derived from the lexical root meaning 'hand'. Interestingly, this development seems to post-date the replacement of the original root for 'hand' by


Table 4.233: Numerals for '5' in Northern Atlantic

an innovation in the majority of the branches. At least four independent formations of this kind are attested within eight branches (cf. the evidence quoted in the leftmost column of the table). Both Tenda and Jaad-Biafada terms for 'five' are of common ancestry: they seem to have developed from the root \**ɓəda* at the Proto-Jaad-Biafada level, since both languages belong to the same sub-group. This probably indicates that the pattern based on the term for 'hand' was used in the languages that belong to the Northern group at the proto-level (possibly as an alternative to the inherent NC root for 'five'). In view of this, the formal alterations of 'five' are easily explained as those automatically caused by the replacement of the inherent term for 'hand' by an innovation. As we hope to demonstrate in the next chapter, the derivational pattern 'hand' > 'five' is surprisingly rare in the NC languages. It is barely attested, for example, in Benue-Congo, thus being characteristic of the North Atlantic languages (and the Atlantic languages on the whole, see below).

In view of this, the reflexes of the inherent NC root for 'five' could have been preserved in only a minority of North Atlantic branches. The roots *\*jo/ co*, *\*tVk/ rog* and *\*rib/ ʔiːp* unrelated to the term for 'hand' deserve special attention within this context.

The pattern '5+' ('hand'+) can be securely reconstructed for the terms covering the sequence from 'six' to 'nine'. The uncommon pattern '7=6+1' attested in Biafada was borrowed from one of the Manjak languages (Atlantic Bak), as was the derived term for 'six' (*mpaaji*).


#### 4.12.1.8.5 'Ten' and 'Twenty' (Table 4.234)


Table 4.234: Numerals and patterns for '10' and '20' in Northern Atlantic

With the evidence of the three branches, the reconstruction of the term for 'ten' (tentatively recorded as \**pok*) seems secure. Its attestations are admittedly limited, apparently due to its replacement with derived terms based on 'five' ('hand'). This reconstruction is also supported by the presence of the final velar: as we have seen, it is reconstructable in a number of other numerical terms at the proto-level.

The pattern for 'twenty' is reconstructable as '20=10\*2'. Particular derivatives based on the typologically widely attested patterns ('20' <'person', '20' <'king') seem to have formed independently.

#### 4.12.1.8.6 'Hundred' and 'thousand'

The evidence points to the absence of these terms in Proto-North Atlantic. Attested forms are borrowings from 'influential' languages such as Fula, Wolof, Manding, Hausa (in the case of Niger Fulfulde). Interestingly, the terms in question are already borrowings in some of these source-languages.


#### 4.12.1.8.7 Proto-North Atlantic numeral system (Table 4.235)


Table 4.235: Proto-North Atlantic numeral system (\*)

### **4.12.2 Bak**

### **4.12.2.1 Joola languages**

Over a hundred sources covering the numeral systems of fifteen major Joola dialects have been made available to us courtesy of Guillaume Segerer. His collection of evidence may be labeled a 'dialect atlas' of numerical terms. These terms often exhibit significant variations not only in their phonetics but in the inventory of lexical roots as well.<sup>36</sup> The name Joola pertains to a group of at least seven related languages (including Bayot). A study of their numeral systems may help set a clearer distinction between these languages. Moreover, it might shed some light on their (hitherto unclear) internal classification.

Numerical terms as attested in ten major Joola languages are discussed below.

#### 4.12.2.1.1 'One' (Table 4.236)

The main form is reconstructed as \*-*anor*, with the initial vowel forming a part of the root. The only languages where this root is not present are Bayot (*don* '1') and Kwaatay (*fɛnɛŋ* '1'). The root *əkon* with a vocalic opening (sporadically attested in Kasa and Bayot) is found in Fogny alongside \*-*anor*.

#### 4.12.2.1.2 'Two', 'three' and 'four' (Table 4.237)

Two alternative roots for 'two' are attested in Joola, namely *\*si-ɬubəʔ* and a relatively widespread *\*si-gabaʔ*.

<sup>36</sup>I wish to express my gratitude to G. Segerer for his assistance with regard to the dialectal attribution of sources.


Table 4.236: Joola numerals for '1'

Table 4.237: Joola numerals for '2'-'4'



The term for 'three' goes back to *\*si-feeɡir*, with its reflexes being attested in all dialects.

The term for 'four' is securely reconstructed as \**si-bääkiɽ*.

#### 4.12.2.1.3 'Five' and 'ten' (Table 4.238)


Table 4.238: Joola numerals for '5' and '10'



The Banjal form \**tən* (reconstructed on the basis of the compound numerical terms) and the (related?) Fogny form *fu-tam* attested in a source dating to the seventeenth century (d'Avezac 1845) are of special interest.

The Mlomp form of 'five' (sporadically attested in Kasa as well) is identical to the Karon form for 'ten' (*ŋaa-suwan* in both cases). The etymology of these forms is unclear. At the same time, the majority of the forms for 'ten' (but not for 'five' as in the majority of the North Atlantic languages) go back to the lexical root meaning 'hands'. To illustrate this point, the lexical stems for 'hand' in the Joola languages are quoted in the table (Table 4.239).


Table 4.239: Joola stems for 'hand'


As can be deduced from the presentation above, at least four lexical roots for 'hand' that serve as a basis for the terms for 'ten' are distinguishable in Joola. Interestingly, the source roots and the numerical terms that depend on them are not necessarily the same within a language. The main root is *\*ku-ŋɛn/ ku-ɲɛn* '10' <'hands'. At the same time, *bɛɛs* 'hand' yields *sɛ-bɛɛs* 'ten' in Mlomp. This derivative is not attested in Kasa and Karon, where *bɛɛs* 'hand' alternates with *ŋɛn/ ɲɛn* 'hand'. The base *\*ka-ʈe* 'hand' attested in Bayot and Kasa yields *gu-tie* in Bayot. Finally, *ɛ-mɔŋu* 'hand' > *su-moŋu* 'ten' in Kwaatay (also *ɛ-ŋɔmu* 'hand' > *su-ŋɔmu* 'ten' with a metathesis).

As noted above, the root *ɛ-ntaaja* attested in Keeraak and Ejamat was possibly incorporated into Kobiana (North Atlantic). This root, admittedly very rare in the Joola cluster, is the only primary one for 'ten' and as such it deserves special attention (especially in view of its later replacement with the derivatives based on 'hand').

#### 4.12.2.1.4 'Twenty', 'hundred', and 'thousand'

Two apparent derivational patterns are used for the term for 'twenty' in the Joola languages:

<'king': Bliss *a-yɩɩy*, Banjal *ə-vi/ə-vvi*, Kasa *a-yi/ ɔ-ji*, Karon *əwi*, Bayot *ə-y*;

<'person': Kasa *an / bu-k-an*, Fogny *ka-banan* 'person finished'.

In Kwaatay the term for 'twenty' is based on 'mouth' (*bu-tum-an*).

The terms for 'hundred' and 'thousand' are borrowings from Mande or 'influential' Atlantic languages (often either Fula or Wolof) in the majority of the dialects, cf. *keme/teme* '100', *wuli, juni* '1000'.

In conclusion it should be added that the Joola terms covering the sequence from 'six' to 'nine' follow the common pattern '5+'.

### **4.12.2.2 Manjak languages**

This branch is represented by three closely related languages (Manjak, Mankanya, Pepel). Numerical terms attested in them are presented in the table below (Table 4.240).

As can be gleaned from the table, the Manjak stems for numerals are very different from those attested in Joola. At the same time, morphological and lexical evidence strongly suggests that these two branches are genetically the closest and belong to the same Bak sub-group.

4 Step-by-step reconstruction of numerals in the branches of Niger-Congo


Table 4.240: Manjak numerals

This implies that the numeral system of one of these branches must have undergone systematic innovations. We will reserve our conclusions until the evidence from the other Bak sub-groups, i.e. Balant and Bijogo, is reviewed.

#### **4.12.2.3 Balant**

Despite the fact that Balant is usually treated as one language, we will present the evidence of Balant Ganja and Balant Kentohe separately (Table 4.241), because the difference between these two idioms is of key importance to our study.

The opening sequence of the Ganja terms is quoted according to Creissels & Biaye 2015. They form the most reliable part of the presentation. A few remarks pertaining to the differences between these Balant dialects are in order. First of all, the Balant Kentohe terms for 'one', 'two', 'three' and 'six' exhibit a final homorganic nasal of uncertain origin. The forms attested by Koelle in the 19th century sources suggest that we are dealing with a morpheme **-n** not assimilated to a preceding consonant in place of articulation. Secondly, Koelle's evidence speaks in favor of 'six' being a base for a larger group of numerical terms. According to him, not only 'eight' and 'nine' but also 'ten' followed the pattern '6+'.

### **4.12.2.4 Bijogo**

Let us examine an analysis of the Bijogo numeral system found in (Segerer 2002). According to him, the term for 'one' is *nɔɔd* ("cette forme est retenue pour l'énumération abstraite", ibid. 171). His interpretation of \*-**d** as the only true reflex of the etymon (with other segments ensuring the grammatical agreement) is immediately convincing, cf. the following examples quoted by him (ibid. 171):



Table 4.241: Balant numerals

	- b. *e-booʈi ɛ-nɛɛd* 'a dog'
	- c. *u-gbe u-nɛɛd* 'a road'
	- d. *ka-jɔkɔ n-ka-d* 'a house'
	- e. *ŋɔ-katɔ ŋ-ŋɔ-d* 'a fish'.

Segerer justly observes that *'*La forme générale de l'élément ayant pour valeur 'un (autre)' est donc **(V)-n-pC-d**, où **pC** est le préfixe de classe du nom déterminé*'* (ibid. 171).

He also quotes the form *dideeki* 'seul' (var. *deeki* 'tout seul'). A variant of this form probably appears as *èɖìgɛ́/ néédigɛ/ módiigɛ* 'one' in Wilson and Koelle.

As demonstrated by Segerer, the term for 'three' (*ɲ-ɲɔɔkɔ*) is a Bijogo innovation of a cultural origin, cf. sg *ɲɔ-ɔkɔ* - pl of *nɔ-ɔkɔ* 'finger' (dim. <*kɔ-ɔkɔ* 'hand'):



Table 4.242: Bijogo numerals

'Un roi bijogo ne se déplace jamais sans l'attribut symbolique de sa fonction, constitué par une sculpture de bois et de corne … Cet objet, nommé u-ran kɔ-ɔkɔ, représente une main à trois doigts' (ibid. 172). It should be noted that this root is attested in all Bijogo dialects and was already recorded by Koelle (*-ɲɔ́ɔ́gɔ*).

As established by Segerer, the same root is attested as *ɔkɔ* in the terms for 'five' and 'ten'.

#### **4.12.2.5 Proto-Bak**

Now we will compare the Bak numerals.

#### 4.12.2.5.1 'One' (Table 4.243)

A comparison of the terms quoted in the leftmost column yields the form that can be tentatively recorded as *\*don*. The rightmost column gives an overview of roots attested in only one out of four branches.



Table 4.243: Bak numerals for '1'

#### 4.12.2.5.2 'Two' (Table 4.244)

Table 4.244: Bak numerals for '2'


The leftmost column presents the root attested in three sub-groups. It is traceable to *\*ɬubəʔ.*

#### 4.12.2.5.3 'Three' and 'four' (Table 4.245)

Table 4.245: Bak numerals for '3' and '4'


For the first time in our step-by-step analysis of numeral systems in the numerous NC families we observe the existence of a separate root for 'three' in each of the branches of a language group.

The term for 'four' exhibits an isolated Joola-Manjak innovation as well as isolated innovations in Balant and Bijogo.


#### 4.12.2.5.4 'Five' (Table 4.246)

Table 4.246: Bak numerals for '5'


The pattern 'hand' > '5' is traceable within two branches. However, the roots involved are different in each case. Numerous isolated forms are grouped together in the rightmost column.

#### 4.12.2.5.5 The terms from 'six' to 'nine' (Table 4.247)

Table 4.247: Bak numerals and patterns for '6'-'9'


The form *\*paag/paaj* 'six' is a common Manjak-Balant isogloss.<sup>37</sup> It is not surprising that the primary term for 'six' attested in these languages served as the basis for the '7=6+1' pattern. This pattern received further development in Balant where it was employed for terms up to 'ten' (i.e. '10=6+4') according to the 19th century sources. At the same time, the archaic pattern '8=4PL'/'8=4 redupl.' is attested in these languages alongside the pattern '8=6+2'.

<sup>37</sup>Guillaume Segerer is right to note (p.c.) that the Manjak-Balant form \**paag-* '6' may be related to Joola \*-*feeɡir*/-*həəji* '3'.
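The additive patterns described above can be spelled out with a small sketch. The Python function below is purely illustrative: the name `additive_gloss` and the format of its output are assumptions introduced here, not part of the source analysis. It simply lists the sums implied by a '6+' pattern extended up to 'ten' and by a '5+' pattern for 'six' to 'nine'.

```python
def additive_gloss(n: int, base: int) -> str:
    """Return a gloss such as '7=6+1' for a numeral built on an additive base."""
    if not base < n < 2 * base:
        raise ValueError("n must lie strictly between base and 2*base")
    return f"{n}={base}+{n - base}"

# '6+' pattern extended up to 'ten', as reported for 19th-century Balant
six_plus = [additive_gloss(n, 6) for n in range(7, 11)]

# '5+' pattern covering 'six' to 'nine'
five_plus = [additive_gloss(n, 5) for n in range(6, 10)]

print(six_plus)   # ['7=6+1', '8=6+2', '9=6+3', '10=6+4']
print(five_plus)  # ['6=5+1', '7=5+2', '8=5+3', '9=5+4']
```

The sketch covers only the arithmetic of the patterns; it says nothing about the actual word forms, which have to be taken from the tables.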


#### 4.12.2.5.6 'Ten' (Table 4.248)


Table 4.248: Bak numerals for '10'

In addition to the common pattern '10 = 'hands'', both branches share a common root (*ntaaja*) that could be interpreted as a shared Proto-Joola-Manjak innovation.

#### 4.12.2.5.7 'Twenty', 'hundred' and 'thousand'

The term for 'twenty' is based on the lexical root meaning 'person' in all of the branches (except for Manjak, where it was replaced with the pattern '20=10\*2'). The same development is observable in Balant Ganja as well.

The terms for 'hundred' and 'thousand' are most likely borrowings. However, the origin of *kont*/*kunt* 'thousand' attested in three of the Bak branches deserves special discussion (in North Atlantic this root (*ŋ-kontu*) is found in both of the Buy languages).

<sup>38</sup>The stem is attested only in Joola Feloup, so, it seems to be borrowed from Manjak.


#### 4.12.2.5.8 Overview of the Bak numerical terms (Table 4.249)


Table 4.249: Bak numerals

### **4.12.3 North Atlantic and Bak Atlantic numerals in the comparative perspective**

It should be stressed that the Atlantic family is among the most divergent within Niger-Congo. Some of the numerical terms in both of the Atlantic groups exhibit a variety of forms potentially explained as Proto-NC reflexes. Moreover, the comparative evidence presented in Tables 4.227 (Proto-North-Atlantic) and 4.249 (Proto-Bak-Atlantic) points to the near total absence of common roots present in both groups. The only exception to this is the root *tɔk/ tVk* 'five'.

In view of this, the only available solution would be the study of the Atlantic evidence within a wider NC context (i.e. in contrast to the reconstructions available for other NC families). A comparison of the intermediate reconstructions within the macro-family will be offered in the next chapter.

### **4.13 Isolated languages vs. Atlantic and Mel**

According to the traditional classification outlined in Sapir 1971, Limba, Sua and Gola belong to the Atlantic languages. However, as we tried to demonstrate in Pozdniakov & Segerer 2017 (forthcoming), this hypothesis is as ill-grounded today as it was half a century ago.

An overview of the pertinent data for each language is presented in the tables below.

### **4.13.1 Sua**


### **4.13.2 Gola**

Table 4.251: Gola numerals


### **4.13.3 Limba**

Table 4.252: Limba numerals



This chapter includes 250 tables presenting the evidence by group, branch or sometimes a dialect of a certain language. Among them are summary tables that provide an overview of the numerical terms in twelve major families of Niger-Congo and in a number of isolated languages. Our attempt at reconstructing the Proto-Niger-Congo numeral system on the basis of this comprehensive evidence will be presented in Chapter 5.

## **5 Reconstruction of numerals in Niger-Congo**

### **5.1 'One'**

The five stems presented in Table 5.1 are the most likely candidates for the reconstruction of 'one' in NC.

**Commentary.** The chart is used to demonstrate the distribution of roots across language families. It groups twelve families into five major branches, including Western NC (Atlantic, Mel), Northwestern NC (Dogon, Gur, Mande), Northern NC (Ubangi, Adamawa), Southern NC (Kru, Kwa, Ijo, BC), and Eastern NC (Kordofanian).

It should be stressed that this grouping has no implication for the genealogical classification of the NC languages and merely serves as a convenient means of displaying the isoglosses that will hopefully help to adjust the existing classification.

The chart demonstrates a variety of possible reconstructions. However, some positive knowledge can be gleaned from it. First of all, it should be stressed that a step-by-step analysis of the forms for 'one' attested in the families and branches of NC strongly suggests that no other candidates, except for those displayed in the chart above, can be reconstructed. It should also be noted that the reconstruction of a tri- or even disyllabic root on the basis of the available evidence seems highly improbable, since all potentially reconstructible roots are monosyllabic. Moreover, the inventory of these roots is limited and merits special discussion. Such a discussion is essential, since many of the quasi-reconstructions presented above are not immediately apparent. The problems pertaining to the reconstruction of these roots were to some extent treated in the previous chapter. What follows is a brief survey of the basic facts.

**The root** *\*di***.** This well-known root has received much scholarly attention as the major candidate for the reconstruction of 'one'. It is manifestly absent only in Kru, Mande and Dogon. In addition to the families listed above, this root is also attested in the Laal language isolate (*ɓɨ̀dɨ́l (ɓɨ̀-dɨ́l?)* '1'). It is absent in the Sua, Gola and Limba isolates. It bears reminding that the reconstruction of this root in Benue-Congo and Bantu is only possible under the assumption that PB *mòdì < \*m-ò-dì* '1' (with **m-** being a Proto-Bantu cl1, and **-o-** being an archaic noun class marker (possibly < **\*ko-/\*ʔo-**, i.e. NC class cl1 incorporated into the stem)).

Table 5.1: Niger-Congo stems for '1'

**The root** *\*in***.** Although this root is not attested outside Western NC, BC and possibly Adamawa, it is worth mentioning, especially in view of its possible etymological relationship with \**di* (see above).

**The root** *\*do***.** The same is applicable to \**do* (best attested in Northern NC, Atlantic and Kru).

**The root** *\*ti***.** The reconstruction of \**ti* '1' is the least certain among the roots discussed above. The form *ha-nthe* '1' attested in the Limba language isolate is noteworthy.

**The root** *\*gbo, \*kpo***.** The last root is a tentative representation of the forms with the initial labio-velar (or labial in the case of Western NC) that are not necessarily etymologically related. The root *ɡuùŋ* '1' attested in the Gola isolate may belong here as well.

In addition to the five roots treated above, apparent innovations may be attested in particular families (or even in groups within them). Among these are Kordofanian *ʈɔn* (cf. Sua *sɔn*), Gur *túrú/tumɔ*, Mande West *kelen*, and Atlantic Bak -*anor, əkon*.

### **5.2 'Two'**

### **5.2.1 'Two'**

A systematic comparison of the terms for 'two' attested in the NC families yields somewhat unexpected results. The only candidate for the reconstruction of the NC term is the root that can be tentatively recorded as \**di*. However, nearly every family has its own root (or, more often, roots) for 'two' that finds no parallel outside the branch/family in question. The distribution of \**di*, as well as an overview of isolated roots, is presented in the chart below (Table 5.2).


Table 5.2: Niger-Congo stems for '2'

**Commentary.** The isolated forms are as follows: Laal *ʔīsī (ʔī-sī?)* (this root is comparable to that attested in Ubangi), Sua *cen*, Gola *tì-yèe/ tī-el/ cel* (the Gola and Sua terms may be related), Limba *ka-le/ kaa-ye* (this root may go back to NC *\*di*).

The unprecedented variety of forms exhibited by the term for 'two' is especially surprising because this notion has been viewed as one of the most persistent in language history (it is the only numeral on the Swadesh list). As we will see below, this term is the least stable in the Niger-Congo languages. However, the NC root \**di* is well-attested across the families.

### **5.2.2 'Two' = 'one' pl?**

As can be gleaned from the evidence presented above, the only root for 'two' reconstructible in NC (*\*-di*) is suspiciously similar to the most likely reconstruction for 'one' (*\*-di*). This similarity was first observed by Raymond Boyd, one of the most renowned experts in the reconstruction of Adamawa. Before we turn to the discussion of the most promising (in terms of the NC reconstruction) forms, an overview of Raymond Boyd's hypothesis regarding Adamawa and some of the BC languages is in order. Here is what Boyd writes about the reconstruction of 'one': "A rather complicated hypothesis would, in fact, cover most of the Cross River/Platoid data: Let us assume a single root, \*DI (sometimes ~\*DU) and two affixes, (V)K(V) and (V)N(V), which can appear, separately or together, as either prefixes or suffixes, or both. <…> Some support for this hypothesis is provided by the frequently observed inversion of the coronal and velar features: in most cases, where we find a term with initial velar, we find a final coronal nasal; and where we find an initial coronal, we find a final velar nasal. This can be explained by assuming the prefixation of \*KV-N- in the former case, and suffixation of \*-N-K(V) in the latter." (Boyd 1989: 151–152). Boyd's proposal is to reconstruct the Proto-Adamawa terms for 'one' and 'two' as *\*n-di* and *\*bà-dí* (with class 2 prefix) respectively (Boyd 1989: 156). According to him, "It was suggested above that the Cross River/Platoid root for 'one' was \*DI. We may now hypothesize that the root for 'two' in the proto-language for these groups was the plural \*BA.DI, and that, when Proto-Bantu developed its more complicated class system, this term, whose prefix may have been invariable, was reinterpreted as mono-morphemic" (Boyd 1989: 157).

It should be stressed that Boyd's hypothesis explains the Proto-Bantu forms that underwent the following transformation over the course of time: *\*m* (cl1) *o*(<\*cl1)*-di* > \**mʊ̀-òdì / mòì* '1'/*ba*(cl2)*-di* > *badi* '2' (the dialectal Proto-Bantu form *jòdè* (zones BH) (< *\*jò(*cl5 ?)-*di*?)). It bears reminding that our evidence favors the reconstruction of *(o-)di(n)* '1'/*ba-di / ba-ji* '2' at the BC level.

One of the major problems with this reconstruction is that synchronically the roots for 'one' and 'two' are the same in only a minority of the modern NC languages. This rare phenomenon is attested in the Ngabaka branch of Ubangi (Table 5.3).


Table 5.3: The same stem in '1' and '2' (\**di*)

As stated above, examples of this kind are exceptionally rare. A possible explanation for the overwhelming absence of the identical roots for 'one' and 'two' is that one of the classes is subject to the nasalization process (entailing further phonetic changes within the root), while the other is not. It bears reminding that, according to Boyd, a number of expanded forms such as *\*n-di* (with further development to *\*-ni / -in* 'one') is reconstructible along with *\*-di*.

In view of this, the Oti-Volta numbers, thoroughly discussed in the previous chapter, are especially interesting. The pertinent Oti-Volta forms are as follows (Table 5.4).

Table 5.4: Potential reflexes of *\*di* '1' = *\*di* '2' in Gur


The terms for 'one' and 'two' are similar within each of the branches, the differences between them being due to the presence of the nasal component in the term for 'one'.

### **5.3 'Three'**

As is well known, the term for 'three' is exceptionally persistent, with the same root attested in all of the major NC branches (except for Mande). The same root is also present in the Western NC isolates, cf. Sua *b-rar*, Gola *taai/tāāl*, Limba *ka-tati*. However, some languages exhibit what are apparently innovative forms (see the bottom segment of the chart). An isolated root is also attested in Laal (*māā* '3').

Although the relationship between the reflexes of the main root (\**tath*) is unquestionable, their phonetics pose a problem. The issue is that each family exhibits a great variety of reflexes, while some of them cannot be explained as going back to either the initial \***t**- or the final \*-**t** of the main root. In other words, reliable correspondences (with \***t** preserved) are traceable in the majority of families, but not in the case of 'three'. This forces us to assume that \*t may be irregularly reflected as **s**, **r**, **h** in particular families.

The table below (Table 5.6) provides an overview of the pertinent Bantu reflexes of *\*tátʊ̀* (ABEFGHJKLMNPRS)/*\*cátʊ̀*/*\*cácʊ̀* (CD) 'three' (these reconstructions follow BLR3).


Table 5.5: Niger-Congo stems for '3'

Table 5.6: Reflexes of *\*tátʊ̀* '3' in Bantu



The Bantu forms should be discussed in order to determine which processes in Bantu (and in Niger-Congo in general) give rise to such a diversity of phonetic variants.

The root includes two consonants. Putting aside the problem of the vowel in the second syllable, we label the two consonants C- and -C respectively. Each of them may be dropped, yielding the Bantu forms **ta** and **at** (Figure 5.1).

Each of them can also be transformed, for example by spirantisation (**\*t** > **s**) or by a change to a liquid (**\*t** > **r**, **\*t** > **l**), or it can become voiced (**\*t** > **d**); only after that can the second consonant be dropped (Figures 5.2–5.3).


As a result, we have numerous forms, while the variation can be reduced to a very limited number of processes:


Table 5.7 provides a structured overview of the derived Bantu forms (with no arrows).

Table 5.7: Phonetic variations of *\*tat-*


However, the potential for change in Bantu is not limited to the above. The derivational schemes mentioned above operate not only on *tat* itself, but also on newly derived forms, for example *\*tat* > *sat*, and others (Figure 5.4).


This is where the following forms (Table 5.8), many of which are attested in Bantu, originate (forms without square brackets).

Table 5.8: Reflexes of *\*tat-* attested in Bantu


We often do not know how one or another derived form appeared. For example, the form *las* in the first line of the table could have originated from *\*tas* (as a result of the change in the first consonant – the variation in the line) or from *\*lat* (the change of the second consonant – column). Many of the forms which are predicted theoretically are not attested in Bantu; these are shown in square brackets.
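The combinatorics behind this table can be illustrated with a short sketch. It is not the author's procedure: the inventory `REFLEXES`, the fixed vowel *a*, and the function names are assumptions made here, based only on the changes named in the text (retention of *t*, change to *s*, *r*, *l*, *d*, and loss of the consonant). The second function shows why a form such as *las* has two possible immediate sources.

```python
from itertools import product

# Assumed inventory of reflexes of *t, taken from the changes named in the text:
# retention (t), s, r, l, d, and loss ("" stands for zero).
REFLEXES = ["t", "s", "r", "l", "d", ""]

def predicted_forms(vowel: str = "a") -> dict:
    """Map each (first, second) consonant pair to the shape it predicts."""
    return {(c1, c2): f"{c1}{vowel}{c2}" for c1, c2 in product(REFLEXES, repeat=2)}

def one_step_sources(c1: str, c2: str, vowel: str = "a") -> list:
    """Shapes from which c1-V-c2 could arise by changing a single consonant.

    Either the first consonant was still t (variation along the line),
    or the second one was (variation along the column).
    """
    sources = []
    if c1 != "t":
        sources.append(f"t{vowel}{c2}")
    if c2 != "t":
        sources.append(f"{c1}{vowel}t")
    return sources

forms = predicted_forms()
print(len(forms))                           # 36 theoretically possible shapes
print(forms[("t", "")], forms[("", "t")])   # ta at
print(one_step_sources("l", "s"))           # ['tas', 'lat']
```

The grid deliberately includes shapes that may not be attested anywhere; which cells are actually filled is an empirical matter settled by the tables, not by the sketch.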

The most striking observation here is not the high degree of variation (which itself needs to be considered), but the fact that we find precisely the same variations in different branches of NC. As a result, in different branches of NC, that is, in languages that are only distantly related, we find numerous identical forms, while in every branch taken separately we find an "antimagnetic" landscape of forms, which in closely related languages tend to be maximally differentiated.

Examples from seven branches of NC are given below and divided into two structurally identical tables (Tables 5.9–5.10).


Figure 5.5



Table 5.9: Reflexes of *\*tat-* in Niger-Congo (1)

We see, for example, that roots **TAL** and **TAR** are observed in all seven branches.

To give a comprehensive idea of the presence of the forms in each branch, we draw attention to the following chart, where the presence of a form (in at least one language) is marked by a cross (the data is arranged in descending order in the summarising column as well as in the summary line) (Table 5.11).

The following chart represents the number of groups (within the 14 branches of Niger-Congo) presenting the respective combinations of the first (the line) and the second (the column) consonants (the data is presented in descending order) (Table 5.12).

As we can see, the most frequent consonants in the initial position are **t-** and **s-**, while the second consonant is one of the following three: **-Ø**, **-t**, or **-r**.

If we reconstruct *\*tat-* on the NC level, in line with the majority of linguists, we will have to contend with quite a mysterious picture. In the majority of younger proto-languages we will also have to reconstruct *\*tat-*, because, as it has already been shown, it descends into more or less the same variation of forms. It means that during thousands of years, from Proto-NC to the formation of proto-languages in separate branches, the form remained phonetically unchanged. Then, suddenly the root *\*tat* independently started to explode, giving rise to much phonetic variation in its reflexes.

Table 5.10: Reflexes of *\*tat-* in Niger-Congo (2)

I think that a hypothesis stating that the root already contained close but not identical consonants in NC is far more typologically justified. The first consonant in that case was **\*t-**, while the second one was represented by a specific phoneme for which no traces remain, for example, **\*-th**?, **\*-ʈ**?, **\*-ts˙**?, **\*-c**? As we tried to show in (Pozdniakov & Segerer 2007), the phonotactics of many languages (not exclusively in Africa) demonstrates the same tendency: in CVC structures languages tend to avoid consonants constituting a minimal pair, for example, *fVp, bVp, sVz, lVr, rVl, sVʃ, etc*. In diachronic perspective, the existence of such combinations often leads to numerous irregular changes, in the course of which the consonants either become identical, for example, \**lVr* > *lVl,* or, on the contrary, acquire a higher level of contrast, escaping the zone of "dangerous proximity", for example, *\*sVsh* > *sVh, \*bVp* > *bVf*. In other words, similar sounds being adjacent to one another are a constant zone of tension which provokes all possible irregular changes.
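The tendency described above can be pictured as a simple predicate over CVC roots. The sketch below is only an illustration: `CLOSE_PAIRS` contains nothing but the consonant pairs cited in the text, and the names `CLOSE_PAIRS` and `is_tense` are introduced here for convenience. It is not a phonological model and does not compute similarity from features.

```python
# Pairs of similar consonants cited in the text as avoided in CVC roots
# (fVp, bVp, sVz, lVr / rVl, sVʃ). Order within a pair is irrelevant.
CLOSE_PAIRS = {
    frozenset(pair)
    for pair in [("f", "p"), ("b", "p"), ("s", "z"), ("l", "r"), ("s", "ʃ")]
}

def is_tense(c1: str, c2: str) -> bool:
    """True if a C1-V-C2 root combines two 'dangerously close' consonants."""
    return frozenset((c1, c2)) in CLOSE_PAIRS

# Two ways out of the zone of tension, following the examples in the text:
print(is_tense("l", "r"))  # True:  *lVr is unstable
print(is_tense("l", "l"))  # False: lVl, the consonants have become identical
print(is_tense("b", "p"))  # True:  *bVp is unstable
print(is_tense("b", "f"))  # False: bVf, the contrast has been increased
```

Both outcomes remove the root from the list of avoided combinations, which is the sense in which the irregular changes are said to relieve the tension.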


Table 5.11: Distribution of different reflexes of *\*tat-* in the Niger-Congo families


Table 5.12: Number of different phonetic structures for '3' in 14 NC branches

It is very likely that such a situation characterises the NC root for 'three'. In this case, the considerable phonetic variability of the root in all the stages of its development from Proto-NC to contemporary languages can be typologically – phonotactically – explained.

### **5.4 'Four'**

Just like the term for 'three', the term for 'four' is exceptionally persistent in NC. It is represented by the same root in all the families (except for Mel and Kordofanian), as well as in the Western NC isolates, cf. Sua *b-nan*, Gola *tii-nàŋ*, Limba *ka-naŋ*. At the same time, a number of innovations are attested in some of the families (see the bottom segment of the chart) and in the Laal isolate, cf. *ɓiīsān* (*ɓī-sān*?) '4'.

This root is not present in Nilo-Saharan (including Songhai), nor in Afroasiatic or Khoisan. In light of this, the root can be viewed as one of the best isoglosses indicating the genetic relationship of languages within NC. Used together with the isogloss for 'three', it becomes a powerful means of classification, i.e. if the term for 'three' has (or goes back to) **t**- as the initial consonant in a given language, whereas the term for 'four' starts with **n**-, this language must belong to the Niger-Congo family. Hundreds of the NC languages match this description, while, as far as I am aware, none of the languages from other families meets these requirements.
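The combined criterion can be written down as a surface check. The sketch below is a simplification under stated assumptions: the helper names are invented here, class prefixes are assumed to be separated by a hyphen, and only the attested shape is inspected, so a form that merely "goes back to" **t**- will not be recognised. The forms are those quoted in this chapter for the isolates and for Laal.

```python
def stem(form: str) -> str:
    """Strip a hyphen-separated class prefix, e.g. 'ka-tati' -> 'tati'."""
    return form.rsplit("-", 1)[-1]

def matches_t_n_isogloss(three: str, four: str) -> bool:
    """Surface check: 'three' begins with t- and 'four' begins with n-."""
    return stem(three).startswith("t") and stem(four).startswith("n")

# Forms quoted in the text for the Western NC isolates and for Laal
data = {
    "Gola": ("taai", "tii-nàŋ"),
    "Limba": ("ka-tati", "ka-naŋ"),
    "Sua": ("b-rar", "b-nan"),
    "Laal": ("māā", "ɓī-sān"),
}

for language, (three, four) in data.items():
    print(language, matches_t_n_isogloss(three, four))
```

Gola and Limba pass the check and Laal fails it. Sua *b-rar* also fails on the surface, although the text treats it as a reflex of the same root for 'three'; applying the criterion as formulated therefore requires etymological knowledge that this sketch does not encode.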



Table 5.13: Niger-Congo stems for '4'

There will probably be no objection from the specialists in the field to the statement that the main root for 'four' begins with \***na**-, e.g. this form is reconstructed for Proto-Potou-Akanic-Bantu by John Stewart. However, many languages show that the root initially included two vowels, \***i** being the second of the two. The major issue, however, is establishing whether the root included another consonant (i.e. whether \**nai* or \**naCi* should be preferred) and if so, what it was. Stewart suggests *\*na~ŋi~* '4' as the Proto-Potou-Tano-Congo form (Stewart 1983), but his reconstruction is not applicable to NC.

However, the reconstruction of the proto-form for 'four' is not an easy task. The problem is that a given form does not define the languages it is attested in as members of the same group. Nearly every group has an inventory of phonetically similar forms (just as in the case of 'three'). The Bantu languages may provide a good illustration of this phenomenon.

The most frequently attested Bantu forms include *na, nai, nayi, ne, nei* and *ni* (six in total). They are found in 276 of 355 Bantu sources that include a form for 'four' available in our database. Their zonal distribution is as follows (Table 5.14).


Table 5.14: Distribution of the main n- forms for '4' in Bantu zones

As can be gleaned from the table, the six forms discussed above are commonly attested in our sources stemming from zones as diverse as C, F, J, M, and S. For instance, pertinent forms are attested in 26 out of 27 sources available in our database for the J zone (the last source, namely the Luganda language, has *nya* 'four' that probably goes back to the same root).
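For orientation, the counts quoted above can be converted into percentages. The helper name `share` is an assumption made for this sketch; the figures themselves are those given in the text.

```python
def share(attested: int, total: int) -> float:
    """Percentage of sources showing one of the six n- forms, to one decimal."""
    return round(100 * attested / total, 1)

print(share(276, 355))  # 77.7 (all Bantu sources with a form for 'four')
print(share(26, 27))    # 96.3 (zone J)
```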

The problem, however, is that this (or a nearly identical) set of forms is attested within the other NC families as well, cf. e.g. the Kwa evidence (Table 5.15).

Table 5.15: Main n- forms for '4' in Kwa



The Adamawa evidence is as follows (Table 5.16).

Table 5.16: Main n- forms for '4' in Adamawa


My suggestion is that the variety of similar forms attested in the majority of the NC branches may be due to the complex inter-relationship between the terms for 'four' and 'eight' in NC. We will return to this hypothesis later, in the section dealing with 'eight'.

### **5.5 'Five'**

The term for 'five' is typically based on the lexical term for 'hand' in Mel and Atlantic. At the same time, the term for 'ten' is often derived from 'five' or, like 'five', directly from 'hand' in the plural. Multiple examples illustrating this phenomenon will be provided below. At this point I will limit myself to merely stating that the attestation of this pattern throughout the NC branches is inconsistent. Thus, it is virtually unattested in Bantu (as well as in BC on the whole). According to Nurse & Philippson 1975/1999, the Usseri dialect of Rombo (Bantu E) is a unique exception in this respect, cf. *ku-oko* 'hand' (Proto-Bantu *\*bókò*) yielding *ku-oko* ('5') and *ku-oko ka-vili* ('10', '5\*2'). At the same time, the reflexes of the Proto-Bantu roots for 'five' (*tanu*) and 'ten' (*i-kumi*) are attested in this language along with the irregular forms discussed above. These two patterns are barely attested in Kwa, Gur, Kru, or Ijo. By contrast, they are common not only in Atlantic and Mel but also in Ubangi (Gbaya in particular), in some of the Adamawa languages, in a number of Kordofanian branches and possibly in Mande. In view of this distribution, the existence of these patterns in Proto-NC seems unlikely. Apparently, the terms for 'hand' should be considered when trying to establish the NC etymology for 'five' and 'ten'.

Our discussion will start with the unrelated roots for 'hand' and 'five' attested within the same branch. Then we will turn to the evidence of those groups where both terms go back to the root for 'hand'. This approach will allow the accumulation of data that will enable us to suggest a likely diachronic explanation for the phenomenon.


We will start with the Bantu evidence. The Bantu languages (like the majority of the NC groups in general) are characterized by the presence of multiple roots for 'hand' and 'arm'. The most persistent of these according to BLR3 are the following roots (Table 5.17).


Table 5.17: Distribution of the stems for 'hand', 'arm' in Bantu zones

I would like to stress that these roots are virtually unattested in Bantu with the meaning 'five' or 'ten'. According to BLR3, the only primary root for 'five' commonly attested in Bantu is *\*táànò*. In addition, the root *\*dòngò*, which probably goes back to *\*dòngò* 'line, row' (zones: ABCDEGHJKLMNRS) deserves our attention as well.

The initial consonant in *\*táànò* is the same as in *\*tátʊ̀* 'three', which is probably a coincidence. However, this fact can still be used for establishing the genetic relationship of the NC forms for 'five'. The possibility that the languages (or language groups) are related to the reconstructed Bantu forms is stronger if the terms for 'three' and 'five' attested in them have the same initial consonant. The following Bantu evidence (Table 5.18) is illustrative of this admittedly unconventional approach (further BC evidence will be quoted later in this chapter).

This rule does not work in reverse, i.e. a difference in the initial consonants does not indicate that either form is not a Proto-Bantu reflex (Table 5.19).
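The one-directional character of this heuristic can be made explicit in a few lines. The sketch is only an illustration: the function names and the labels "supports" and "inconclusive" are assumptions introduced here, and the second call uses hypothetical forms, not attested data.

```python
def initial(form: str) -> str:
    """First segment of a form, ignoring a leading '*' and a hyphenated prefix."""
    return form.lstrip("*").rsplit("-", 1)[-1][0]

def evidence_from_initials(three: str, five: str) -> str:
    """One-directional test: a match supports relatedness, a mismatch proves nothing."""
    return "supports" if initial(three) == initial(five) else "inconclusive"

# Proto-Bantu reconstructions quoted in the text
print(evidence_from_initials("*tátʊ̀", "*táànò"))  # supports

# Hypothetical forms (not attested data) with different initials
print(evidence_from_initials("sar", "tan"))        # inconclusive
```

The function never returns a negative verdict, which mirrors the point made above: differing initials are compatible with both forms being regular reflexes.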

The fact that the same consonants are reflected differently may have several explanations, e.g. that the noun class prefixes (especially the nasal marker of class 9) may have impacted the process. A number of other phonotactic factors may also be involved (some of which are treated in detail in the section dealing with 'three').


Table 5.18: Identical initial consonants in '3' and '5' in Bantu

Table 5.19: Different initial consonants in '3' and '5' in Bantu



The pairs of BC terms with the same initial consonant attested outside Bantu will be our primary concern in further discussion. Some of them are quoted in the table below (Table 5.20). As can be gleaned from the table, the root *\*tanV / \*taVn* is systematically attested in nearly every BC branch, hence its reconstruction at the Proto-BC level seems certain. Moreover, it is widely attested in many other NC branches as well. The following forms of 'three' and 'five' (with the same initial consonant) are comparable to the Proto-BC root (Table 5.21).

Table 5.20: Identical initial consonants in '3' and '5' in Benue-Congo


Table 5.21: Identical initial consonants in '3' and '5' in Niger-Congo

Table 5.21 shows peculiar forms attested in one of the Southern Mel languages (Bom) that are virtually identical to the BC reconstructions. Thus, we have every reason to reconstruct the term for 'five' as \**tan* (unrelated to 'hand') at the NC level. The distribution of this root is illustrated in the following chart (Table 5.22).

Table 5.22: \**tan* '5' in Niger-Congo


<sup>1</sup>Elugbe 1987.


The attestations of this root in Southern NC (namely in BC, Kwa and Ijo) are more systematic. In Western NC the root is reliably attested as well, despite the fact that the Northern Mel form *kə-ʈamaʈ* allows a two-fold interpretation (i.e. as a derivative of either *ʈam*- or \**kə-ʈa* 'hand').

The Bom form is a direct reflex of *tan* 'five'. It bears reminding that the final velar in the Northern-Atlantic forms is regular. In the Gur languages, the pertinent form is attested in particular branches only. As attested in Western Mande, the form implies a semantic innovation, i.e. \*'5' > '10'. The relationship of the Kordofanian forms is not immediately apparent.

The distribution of the alternative reconstructible root \**nu*/*nun* is described in the chart below (Table 5.23).


Table 5.23: \**nun* '5' in Niger-Congo

A comparison to Kru implies the labialization of dentals in the vicinity of a back vowel. As the Dogon and Gur evidence suggests, the root is possibly derived from the term for 'hand'. In Dogon the forms of 'five' and 'hand' differ in all languages/sources. Interestingly, the term that means 'five' in one Dogon language may be used with the meaning 'hand' in another (and vice versa; see Hochstetler et al. 2004), cf. the following evidence (Table 5.24).

In light of this, the fact that, according to some sources, similar distribution of the same root is attested in a number of Gur languages is intriguing, cf. e.g. the following data (Table 5.25).

This raises the question: are we dealing with direct Dogon-Gur contact or with the reflexes of an additional NC root for 'hand'? The following roots may be considered potential correspondences: Proto-Bantu *\*nàmà* 'limb: arm; leg; thigh' (Regions 4: NW SW Ce NE; Zones 6: ABEHMR) or *\*nʊ̀è* 'finger, toe' (Regions 5: NW SW Ce NE SE; Zones 9: ADJKLMPRS); cf. Bantu, zones MN, Nyiha-Malila-Lambya *i-nyove* (Nurse & Philippson 1975/1999), and Aku (Defoid) *ɲɔwɔ* 'hand' (Koelle 1963[1854]). The Bak (Atlantic) root *ñen* 'hand', 'five' discussed above may belong here as well. The Gola root *nɔ̀ɔ̀nɔ̀ŋ* should also be mentioned here. The meaning 'hand' is not attested for this root in Kwa and Adamawa.

Table 5.24: 'Hand' and '5' in Dogon

Table 5.25: 'Hand' and potential reflexes of *nun* '5' in Gur

The following Atlantic roots attest to the semantic development of 'five' (and consequently 'ten') < 'hand' (Table 5.26).

This data is especially interesting in view of the BC evidence discussed above. As we have seen, the phenomenon of 'five' and 'ten' being based on the term for 'hand' is attested in both Atlantic groups (Bak and Northern). Moreover, this pattern is observable in a wide variety of roots with the meaning 'hand' attested in the languages under study (e.g. five roots with this meaning are attested in eight languages represented in the table above; the derivation pattern is the same in each case). In view of this, it is not surprising that the reconstructed NC root is not traceable in Atlantic.



Table 5.26: 'Hand' > '5' in Atlantic

The same pattern is also attested for 'five' (but not for 'ten') in the Northern Mel languages, which are in contact with Bak, cf. Table 5.27.

Table 5.27: 'Hand' > '5' in Northern Mel


However, we may be dealing with the secondary alignment of the terms for 'hand' and 'five'. The pattern CV-stem-VC (with CV- and -VC being a noun class prefix and suffix respectively) is characteristic of this language group, e.g. the Temne form may go back to *ta-m-ath* with the lexical root *\*-mV-* as its base. This pattern could also explain the similarity between the Temne terms for 'five' and 'ten': in this language *tɔfɔ́t* '10' probably goes back to *tɔ-f-ɔ́t* and hence to the NC root *\*fu* '10'.

Some of the Atlantic languages (e.g. various Joola and probably Proto-Joola as well) developed a separate root for 'five', while the term for 'ten' still remained a derivative of 'hand'. As expected, this root corresponds to Southern NC *\*tan/ ton* '5' discussed above (Proto-Atlantic *\*tok* 'five': Kasanga-Kobiana *ju-roog*, Sereer *ɓe-tak / ɓe-tuk / ɓe-tik*; cf. also Limba *bi-so* ¯ *hi*, Sua *sungun*), cf. Table 5.28.


Table 5.28: 'Hand' > '10' in Joola (Atlantic: Bak)

The etymological link between the terms for 'five' and 'ten' and their source ('hand') is not always explicit, e.g. different roots for 'hand' are attested in some of the sources for Mankanya-Manjak (Atlantic) and Temne (Mel), along with the derived form for 'five'. Such innovations are quoted in bold in the table below (Table 5.29).

Some of the forms of the term for 'five' go back to the root \**ko* in a number of the Ubangi languages (and possibly in some of the Mande languages as well, see Chapter 4 for details). Here we may be dealing with a NC root, cf. e.g. 'hand': Proto-Gbaya *kɔ̃́*, Proto-South Mande *kɔ̏*, Proto-Eastern Mande *gɔn* (?), Dida (Kru) *kɔ̄*, etc.

The following Kordofanian terms that attest to the development of 'hand' > '5' are also noteworthy: Dagik (Kordofanian) *si-s-ɜlːʊ* '5' (lit.: 'one hand'): "The *si* in 5 comes from the word 'hand'. So 5 is 'one hand'",<sup>2</sup> Acheron *zəɡuŋ zulluk* (lit.: 'one hand'): "The number 'five' is literally 'one hand': *zəguŋ* = 'hand', *z-ulluk* = 'one'".<sup>3</sup>

<sup>2</sup> John Vanderelst, https://mpi-lingweb.shh.mpg.de/numeral/Dagik.htm

<sup>3</sup>Russell Norton, https://mpi-lingweb.shh.mpg.de/numeral/Acheron.htm



Table 5.29: 'hand' > '5'/'10' in some Atlantic and Mel languages

To summarize, the primary root for 'five' (\**tan*) probably existed in Proto-NC. Over time it was independently replaced with the derivatives of 'hand' in some branches and various languages. In turn, the original term for 'hand' was replaced with innovations (with the term for 'five' in particular) in a number of languages, cf. Atlantic *rib/ ʔiːp*, Mel *wan/wen*, Mande *dúuru/ sɔ́ɔ́ru*, Kru *gbə / gbo*, Gur *mwan/ bwa*, Ubangi *du(w)/ lu(w)*, Kordofanian *ŋer-/ ɲer*-. As a rule, these innovations (not quoted here exhaustively) are only attested in particular branches of the families under study.

### **5.6 'Six'**

The explicit pattern '6=5+1' is present in the vast majority of the families. Primary terms for 'six' are attested in some of the NC families (or, more precisely, in their particular branches). However, they cannot be reconstructed at the NC level (see Chapter 4 for their detailed treatment). Selected forms of this kind include Atlantic *paag/paaj* ('7=6+1'), Kwa *golo / kolo, kua, ciɛ* ('7=6+1'), Adamawa *jup, gu*, Ubangi *zala/ zya*, Dogon *kuro/ kule*, Gur *do(b)*, Mande *t(s)um*? (the examples are quoted by family without further detail). The pattern '6=3 redupl.' is rarely attested. It is found in BC (possibly as a Proto-BC innovation attested in Bantoid, Cross, Edoid, Kainji?, and Platoid) and Kordofanian only.


### **5.7 'Seven'**

The main pattern is '7=5+2' (or '7=X+2' if the term for 'five' is replaced with an innovation). Primary roots are rare, being attested in BC (Defoid *\*byē*, cf. Edoid *ghie?*; Idomoid *renyi*, cf., however, Ikaan *h-ránèʃì* '6+1'), Adamawa (*bir/ bil*, *rɪŋ, nbutu*), Ubangi (*sílànā, lɵ̀-rɵzi*), Dogon (*suli/ soli/ soye*), Gur (*pɛ*(*n*)) and Atlantic Bak (*jand/ jaanʔ/ cand* (Pepel)).

The rare patterns of '7=6+1' and '7=4+3' are limited to Atlantic Bak, Kwa, BC Platoid, and Kordofanian.

### **5.8 'Eight' ('four' and 'eight')**

In the majority of the NC families the term for 'eight' is historically based on the term for 'four' (with the exception of Mel, Kru, Dogon, Mande and Western NC isolates).

The pattern '8=4+4' is normally implemented via the reduplication of the root for '4'. In some cases an 'entire' reduplication (affecting the conjunction and the noun class marker) is employed (Table 5.30).

The reduplication can also be 'partial' (as a rule the reduction of the first syllable is involved), cf. Table 5.31.

This pattern can also be used when the original root for 'four' is replaced by another one, cf. the Balant (Bak) evidence: *tahla* '4' ~ *ta-ta(h)la* '8'. The same is observable in Yungur (and possibly in Burak (Adamawa)), cf. *net* '4' ~ *nat-at* '8' (Boyd 1989).

Sometimes 'eight' is derived from 'four' not via reduplication, but by means of a simple replacement of cl.sg with cl.pl (or by adding the Pl. marker), cf. Table 5.32.

In Dii (Adamawa-Duru) a step-by-step replacement of classes is used as a derivation mechanism, i.e. '2' > '4' > '8': *i-dú* '2' > *nda-ddʉ́*'4' > *ka-ʔa-nda-ddʉ́* '8'.

A rare pattern is '8=4\*2', with the direct involvement of the term for 'two', cf. Viemo (Gur) *jumĩ* '4', *niinĩ* '2', *jumĩ-jɔ niinĩ* '8'.

When considering the reconstruction of 'four', it should be noted that if the term for 'four' (on which a reduplicated term for 'eight' is based) has any vowel other than [a] (typically [e] or [i]), the reduplicated form either preserves the vowel present in 'four' or has [a] in the first syllable. This mechanism is confirmed at least in the case of Bantu (Table 5.33).
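The vowel behaviour described above can be illustrated with a minimal sketch. It assumes a plain CV root written with the five vowel letters `a e i o u`, and the function name `reduplicated_eight_candidates` is hypothetical; it only generates the two shapes the paragraph mentions (first syllable with the vowel of 'four', or first syllable with [a]) and makes no claim about which shape a given language actually uses.

```python
VOWELS = "aeiou"  # assumption: roots are transcribed with plain vowel letters


def reduplicated_eight_candidates(four):
    """Return the candidate shapes of 'eight' built on a CV root for 'four'."""
    # Position of the first vowel in the root (end of the initial CV syllable).
    idx = next(i for i, ch in enumerate(four) if ch in VOWELS)
    # Shape 1: the copied first syllable keeps the vowel of 'four'.
    keeps_vowel = four[: idx + 1] + four
    # Shape 2: the copied first syllable shows [a] instead.
    shows_a = four[:idx] + "a" + four
    return {keeps_vowel, shows_a}


# Bantu ne/ni '4' ~ nane/nani '8' (forms from the caption of Table 5.33)
for root in ("ne", "ni"):
    print(root, sorted(reduplicated_eight_candidates(root)))
```

For a root that already has [a], both shapes coincide, so the function returns a single candidate.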


Table 5.30: '8' < '4+4' (entire reduplication)


Table 5.31: '8' < '4+4' (partial reduplication)

Table 5.32: '8' = 4PL



Table 5.33: *ne/ni* '4' ~ *nane/nani* '8' (Bantu)


The latter fact leads to at least two conclusions: 1) the reduplication mechanism was used to derive 'eight' from 'four' at the Proto-Bantu level; 2) the [a] that is preserved in 'eight' should be reconstructed in the first syllable of 'four', where it was lost.

Moreover, there is a considerable body of Bantu examples of a Proto-Bantu root being preserved in the reduplicated term for 'eight', but lost in the term for 'four' (Table 5.34).


Table 5.34: '8' < '4' ~ '4' is lost (Bantu)

One of the factors that could explain the emergence of the second nasal in the term for 'four' is the alignment of 'four' and 'eight' by analogy, followed either by the replacement of the term for 'eight' with a composite term ('5+3' or '10–2', see Table 5.35) or with an innovation (Table 5.36).

The evidence presented above strongly suggests that the pattern '8=4 redupl.' was already in use at the Proto-NC level.

It should be noted that in those languages where this reduplication mechanism (or the pattern '8=4PL') is observable most clearly, another pattern is often used along with '8=4+4', namely '6=3+3' (or '6=3PL') (Table 5.37).

As expected, numerous languages that belong to different families exhibit a variety of patterns that are reused along with the one discussed above (including the general pattern '8=5+3' as well as '8=10–2' and even '8=6+2'). It seems, however, that such a wide distribution of this pattern ('8=4 redupl.') within the NC languages is genetic rather than typological.


Table 5.35: '8=4+4' > '8=5+3'

Table 5.36: '8=4+4' > '8' innovated



Table 5.37: '8' < '4', '6' < '3'

Primary roots for 'eight' are also attested. However, their attestations are usually limited to one or two families or to particular branches within a family, cf. e.g. '8' in Defoid (BC) *\*jo/ ro* (cf. in Kainji *ro/ ru*), Kwa *kwe/ kye*, Kordofanian *bɔ, ʈəŋi-*, Mande *seki/ segi*, Dogon *sele/ sagi* (< Mande ?), *gá(a)rà*, Atlantic Bak *\*ʊʌs*-. These forms (as well as some additional ones) are interpreted as local innovations.

### **5.9 'Nine'**

The main pattern for 'nine' ('9=5+4') is self-explanatory. This is the only pattern that can be reconstructed for Proto-Niger-Congo.

The alternative pattern '9=10–1' is much less common, whereas the pattern '9=6+3' (attested in Atlantic Bak) is exceptionally rare. The Platoid pattern '9=12–3' seems to be unique, cf. Birom, '15=12+3', '9=minus 3', '10=minus 2'. Primary roots are attested in those languages (branches) that have a full set of primary terms covering the sequence from 'one' to 'ten' (which is a rare case), e.g. Bantoid *bukV* (if indeed primary), Akpes *ɔ̀-kpɔlɔ̄ ̀ʃ(ì)*, Defoid *\*sá(n), dà* (cf. Edoid *cien/sin*), Igboid *totu/tolu*, Ubangi *kùsì*, *me-newá*, Laal *yàŋjáŋ*, Dogon *túwɔ́*, Mande *kònonto/kɔ̀nɔndɔ(n)* (historically perhaps '10–1').
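The compositional patterns cited in sections 5.6–5.9 can be written out as simple arithmetic. The sketch below is only an illustrative encoding: the tuple layout and the names `PATTERNS`, `label`, `is_consistent` and `patterns_for` are assumptions, and reduplication ('6=3 redupl.', '8=4 redupl.') is treated as addition of the base to itself.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# (target, left operand, operator, right operand), as named in the text
PATTERNS = [
    (6, 5, "+", 1),
    (6, 3, "+", 3),    # '6=3 redupl.'
    (7, 5, "+", 2),
    (7, 6, "+", 1),
    (7, 4, "+", 3),
    (8, 4, "+", 4),    # '8=4 redupl.'
    (8, 4, "*", 2),
    (8, 5, "+", 3),
    (8, 10, "-", 2),
    (8, 6, "+", 2),
    (9, 5, "+", 4),
    (9, 10, "-", 1),
    (9, 6, "+", 3),
    (9, 12, "-", 3),   # Platoid (Birom)
]


def label(target, left, op, right):
    """Render a pattern in the notation used in the text, e.g. '9=5+4'."""
    return f"{target}={left}{op}{right}"


def is_consistent(target, left, op, right):
    """Check that the operands really yield the target value."""
    return OPS[op](left, right) == target


def patterns_for(target):
    """List all recorded patterns for one numeral, in the order given above."""
    return [label(*p) for p in PATTERNS if p[0] == target]


assert all(is_consistent(*p) for p in PATTERNS)
print(patterns_for(9))  # ['9=5+4', '9=10-1', '9=6+3', '9=12-3']
```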

### **5.10 'Ten'**

The root *\*pu/ fu* is the most likely candidate for the NC reconstruction. The distribution of its reflexes is shown in the chart below (Table 5.38).


Table 5.38: \**pu/fu* '10' in Niger-Congo

The roots listed in this chart are obviously related. The root is lacking in Kordofanian, where a variety of terms for ten are attested, e.g. *tu(l), rakpac, fəŋən, tiəɽum, 5pl.* This probably indicates that in Proto-Kordofanian the root for 'ten' was not present. The Dogon form *\*pɛ́rú/ pɛ́lú* has the same initial consonant, but our evidence is inconclusive as to whether it is related to the roots above. Finally, the Ijo form *(w)ójí* allows a twofold interpretation. If it is taken as *(w)ó-jí* based on *\*ji*, it is comparable to *zììyà* '10' attested in the Gola isolate. Alternatively, it can be analysed as a complex root *\*(w)o* '10' plus *ji* (< \*'1'). If so, it may be related to the roots quoted above (or at least to one of its allomorphs (?) attested in Kwa).

The presence of forms with the voiced **b-** in Adamawa-Ubangi requires an explanation. The evidence suggesting a connection between the **b-** and **f-** forms attested in these languages is insufficient. In view of this, it can only be noted that a similar phenomenon is observable within the Mande family: the form *\*bù* is reconstructed in the Southern group of the South-Eastern Mande branch, whereas in Western Mande (as well as in the Eastern group of South-Eastern Mande) the reconstructed form is *\*pu/fu*.

It should be noted that the Adamawa root with the initial voiceless labial is only marginally attested (e.g. in Munga (*fuə*) and Pere (*fób*)).

Raymond Boyd tentatively suggests that *fob* is related to the main Adamawa root *\*kop*: «The Kutin group has *fóp* which may be related to *\*kóp*» (Boyd 1989: 162). However, an alternative explanation exists. A brief study of the Adamawa number systems shows that numerical terms attested within this family (unlike those found in other NC families) often end in **-p** or **-b**. The Tula system, one of the first quoted by Boyd in his excellent article, may serve as an example (Table 5.39).

Table 5.39: Labial suffix in Tula numerals


The final **-p** in 'eight' is easily explainable (possibly due to '8=4\*2'). However, at least in the case of 'two' and 'ten', the final **-p** is attested in non-compound terms. In his discussion of the final **-p** in the Adamawa terms, Boyd suggested that we may be dealing with the suffix **\*-(a)p** (or **\*-(a)b**, with the devoicing characteristic of a reduced consonant inventory in the final position). < …> The same suffix also appears in group 1 in *\*naar-ap* 'eight', derived from *\*naar* 'four'. < …> Compare this situation with 'Bantoid' Vute: *'bɯ̄rɯ́p* 'two', *nà:sɯ̀p* 'four'' (Boyd 1989: 156). Furthermore, he challenges Kay Williamson's opinion on whether this morpheme was an original suffix or a suffix that developed out of a noun class prefix. The most important result of this discussion is that the suffix **\*-p/-b** found in numerical terms allows us to trace the Adamawa forms directly to NC *\*pu/po* without the intermediate *\*kop/kob*. As for the isolated Adamawa forms of *bo* 'ten', Boyd suggests a Chadic origin for them, although alternatively they may be related to the similar Ubangi root and reflect the NC root *\*pu / fu*.

The main Adamawa root *\*kop/kob* '10' should be discussed in a wider NC context as well. In view of the secondary nature of the final **-p/-b** in Adamawa (see above), this root is comparable to the NC roots *ko* 'ten ; hand'.

Direct BC parallels for this root (with the final labial) should be discussed first. We refer here to the hypothetical relationship of a number of forms discussed in Chapter 4, including Delta-Lower-Cross *-kɔp/du-op/du-ob* (Dimmendaal 1978 *\*lùgòp*) (cf. Bendi *kpu* '10', nearby *fo/ hwo*), Yukuben-Kuteb (Jukunoid) *kuwub*, Kainji *\*kop / ʔup / kpa* (together with *\*pwa/ pa*), and Platoid *\*kop*. This evidence suggests that more attention should be paid to the reconstruction of the allomorph \**kop* in both Proto-BC and Proto-Adamawa. This root should probably be compared to the Kru root *kʊgba* '10', unless it is a non-compound root that goes back to *ko* (see below).

In view of Boyd and Williamson's interpretation of the final labial as a suffix, the forms quoted above should probably be treated together with the root *ko* '10', which is sporadically attested in multiple families. As noted above, it most probably goes back to the lexical root \**ko* 'hand', that represents one of the alternative Proto-NC reconstructions of this term. Its distribution with this meaning is as follows:

First of all, it is reconstructed by Moñino for Proto-Gbaya as *kɔ̃́* 'hand'. This root is also attested in Mande (at least in the Southern group of the South-Eastern Mande branch, cf. Vydrin's evidence: Proto-South-Eastern Mande *\*kɔ̃* 'hand, arm'). In Kru, this root is attested not only in the Eastern group (Dida *kɔ̄* 'hand'), but in the Western group as well (Glio-Oubi *hõ*, Krumen *hɷ̃*). Finally, it is (admittedly only marginally) attested in Bantoid (as an alternative to the widespread root *kʊ́mɩ̀* '10'): according to Larry Hyman (in Paulin 1995) this root is distinguishable in Kom (*ə̄-kœ̂*) and Narrow Bantu, e.g. in zones B (Mpur *kɔ*, Yansi *kwɔ*) and E (Mashami *oko*, Meru *uko*, Nurse & Philippson 1975/1999). The Limba root *koh-* '10' probably belongs here as well.

It is difficult to say whether this evidence is sufficient for the Proto-NC reconstruction. However, when choosing between the two possibilities for the reconstruction of the term for 'ten' (i.e. from *\*pu/ fu* and *\*ko*) the first one should be preferred.

Among other roots relevant to our discussion, the following two roots (whose attestations are not limited to one family) are of interest: Gur *gba/kpa* '10' (cf. the BC root *gwo*/*jwo*) and Kwa *du* '10' (possibly related to the Adamawa root *d(u)o*; cf. also Kordofanian *ru* and Gur *nu/ nyu*?). The latter root may be compared to Bantu *\*dòngò* '10'. It is attested in seven zones (i.e. EGJMPR according to BLR3, but a number of attestations from D.62 are available, hence it is found in all five regions). BLR tentatively suggests a Bantu etymology for this root ('*spécialisation de "ligne" dòng?*'). However, it has parallels in other BC branches, namely in Cross River (Connell 1991) and probably Idomoid (Table 5.40).

The use of numerous other roots for 'ten' is limited to one family, i.e. they are apparent innovations, such as in Bantoid *kum/kam* '10' (Bantu *kʊ́mì/ kámá*). The latter form (that sometimes coincides with the term for 'hundred') has an internal Bantu etymology: its tentative relationship to the lexical root meaning 'touch' is assumed in BLR 3 (BLR3: 'see also *kʊ́m* 'touch' - zones DHJLM'). However, the nasalization of the final segment in the Bantoid proto-form cannot be excluded. If this process indeed took place, this form becomes comparable to *\*ku*(*b*) as well as others discussed above.



Table 5.40: Parallels for Bantu *\*dòngò* '10' in Cross River and Idomoid

Other isolated forms for 'ten' include Atlantic *(n)taaj*, *taim, -suwan*, Mel *wɨtʃɔ?*, Western Mande *tan* (< \*'5'?), Gur *kɛ(n)*, Kwa *bula* (cf. Ubangi *bale*), Ubangi *busa*, *sui*, Kordofanian *tu(l), di, rakpac, fəŋən, tiəɽum*, Adamawa *kutu(n)* (< *\*kutu(n)*, cf. Laal *tūū*, Kordofanian *ʈʌʌ*, Sua *tɛŋi*), etc.

### **5.11 Large numbers ('twenty', 'hundred' and 'thousand')**

It is better to treat large numbers together for the following reasons:

First, these terms were probably lacking in Niger-Congo, so it comes as no surprise that they are often borrowed from European languages, Arabic, Hausa, Lingala or other "languages of influence".

Secondly, these roots are often identical, i.e. the root that means 'thousand' in one language may mean 'hundred' or even 'ten' in another. Some of the forms simply denote 'a large number'. The well-known migrating root *keme* that has the meaning 'hundred' in the majority of the Mande languages may be used with the meaning 'eighty' or even 'sixty' in other Mande languages.

However, each of the roots has its own characteristics.

In the majority of the NC languages, the term for 'twenty' goes back to lexical roots that mean 'person', 'leader', 'body', 'head', 'grain', 'sack' and 'large number'. Numerous examples of this kind are discussed in Chapter 4. The etymology of those terms for 'twenty' that seem to be primary at the synchronic level should be sought with this in mind.


It can be safely stated that the terms for 'hundred' and 'thousand' were absent in Proto-Niger-Congo. Thus, the pattern 'twenty' = 'person' remains the only reconstruction possibility for large numbers in Proto-Niger-Congo.

### **5.12 Proto-Niger-Congo**

The reconstruction of the Proto-Niger-Congo number system may be summarized as follows (Table 5.41).


Table 5.41: Proto-Niger-Congo numeral system

This table summarizes our discussion. However, it is tempting to apply our conclusions to the evidence pertaining to particular families in order to identify the most archaic families, groups and branches within NC. Such a review of data within a wider NC context could also help to enhance the intermediate reconstructions suggested in Chapter 4.

## **6 NC numbers as reflected in particular families, groups and branches**

No new reconstructions are presented in this chapter. Instead, it aligns the intermediate reconstructions with the wider Niger-Congo evidence and with the conclusions drawn from the reconstruction suggested earlier. Hopefully, these results will enable an evaluation of each of the families (or a group/branch when possible) with regard to the inventory of NC roots preserved in them. In addition, this may enhance our understanding of the NC linguistic taxonomy. We will begin our analysis with the Benue-Congo evidence (Table 6.1).

### **6.1 Benue-Congo**

Table 6.1: NC numerals reflected in Benue-Congo (+)

Commentary:

• The total number of Proto-Niger-Congo roots that have reflexes in each of the BC branches (out of the seven numbers represented in the table) is quoted in the rightmost column.

Table 6.1 demonstrates the following: If we accept this reconstruction, it appears that in only Cross-River do all seven terms discussed above directly reflect their NC prototypes, which makes this branch the most archaic within BC. Six terms out of seven represent NC reflexes in Kainji, Platoid, Bantoid, Bantu and Akpes. In other words, the Proto-NC numerical terms are better preserved in Eastern BC than they are in Western BC. It should be noted that only three terms out of seven have their reflexes in Idomoid and Igboid, i.e. they are the most distant from Proto-Niger-Congo among the languages under study.
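The ranking described in this paragraph can be sketched as follows. Only the totals stated in the text are used; the other BC branches of Table 6.1 are left out because their counts are not quoted here. The names `RETAINED` and `ranked` are illustrative assumptions, not part of the original analysis.

```python
TOTAL_TERMS = 7  # the seven numerals compared in Table 6.1

# Number of Proto-NC numeral roots with reflexes, as stated in the text
RETAINED = {
    "Cross River": 7,
    "Kainji": 6,
    "Platoid": 6,
    "Bantoid": 6,
    "Bantu": 6,
    "Akpes": 6,
    "Idomoid": 3,
    "Igboid": 3,
}


def ranked(retained):
    """Sort branches from most to least conservative (ties alphabetically)."""
    return sorted(retained.items(), key=lambda item: (-item[1], item[0]))


for branch, count in ranked(RETAINED):
    print(f"{branch:12s} {count}/{TOTAL_TERMS}")
```

Such a count is only a rough measure: it weighs every numeral equally and says nothing about how secure each individual etymology is.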


Reflexes of 'three' and 'four' have been preserved in all BC branches. The reflection of 'five' is consistent as well. The same can be applied to 'eight' (the replacement of the pattern '8' = '4 redupl.' with '8' = '5+3' may have occurred independently in some of the branches).

Why the assumed reflexes of the Proto-terms for 'two' and 'ten' underwent a massive replacement is more difficult to explain. In the case of 'ten' a Proto-Western-BC innovation may be assumed, i.e. the replacement of *\*pu/fu* with *\*gbV/gwV*. This is applicable to the Nupoid form *wo* (represented as /+?/ in the table above) as it probably reflects the Western innovation \**gwo* rather than *\*pu/fu*. This raises doubts as to whether our interpretation of the forms attested in Cross (*\*kpo*), Jukunoid (*wo*) and Lufu (*wo*) is correct (these forms were explained above as NC).

The reflexes of the Proto-NC term for 'two' are limited to 4–6 branches (out of the fifteen branches under study). At the same time, the forms that do not go back to \**di* are phonetically quite homogeneous in both main groups of BC (*pa/ba/wa/va*). This suggests that the by-form of 'two' with the initial labial may have already existed at the Proto-BC level.

### **6.2 Kwa**

Interestingly, Table 6.2 shows that some of the Kwa branches are exceptionally variable with regard to the reflection of Proto-NC terms. All seven Proto-terms under study have their reflexes in Ka-Togo, i.e. the Ka-Togo reconstruction is virtually identical to that of NC. However, Ga-Dangme has only the reflex of 'three' (assuming that *-tɛ̃* '3' reflects NC \**tath*). In Nyo, the majority of terms are replaced as well: it seems that only the terms for 'three' and 'four' have been preserved in Proto-Nyo, whereas the preservation of 'ten' (not speaking of 'one' and 'eight', let alone the terms for 'two' and 'five', since the reflexes of *\*di* '2' and *\*tan* '5' are not traceable in any of the Nyo branches) is questionable. This means (assuming Ka-Togo, Na-Togo and Gbe indeed belong to Kwa) we should assume that: 1) the innovations presented in the table above postdate the division of Proto-Kwa; 2) Proto-Ka-Togo was the first language to separate from Kwa, since many of these innovations are homogeneous. This line of reasoning is more difficult to follow in the case of Na-Togo, since Na-Togo shares its innovations for 'two' (*\*nyɔ*) and 'five' (*\*nu*) with Nyo and Ga-Dangme. In other words, the Kwa numbers provide valuable data for the alignment of the internal genealogy of the Kwa languages.


Table 6.2: NC numerals reflected in Kwa (+)

One important point that I would like to stress here is that if the Ka-Togo languages indeed belong to Kwa, we may state that our reconstruction of the NC number system is fully supported by the Kwa evidence.

It should be remarked that in a number of the Kwa branches the forms of 'five' interpreted as innovations in the table above could go back to an alternative NC prototype \**nu*(*n*) '5' with its reflexes attested in Dogon, Gur and Adamawa.

Finally, I would like to note that such a large-scale replacement of Proto-terms as in Nyo and Ga-Dangme (apparently etymologically related innovations) is a promising subject for both special investigation and discussion within the framework of a NC linguistics conference.

### **6.3 Ijo**

The Ijo languages are closely related, hence they do not differ much in the reflection of Proto-NC numbers. An apparent innovation of Ijo is the term for 'two' (mààmV). As for the term for 'one', the reflexes of the NC prototype are distinguishable in the Ijo compounds die/zie/ie. In the case of 'ten' it is, however, unclear whether this form is an innovation or not, since it can also be reconstructed as \**wo*-(*i*) based on \**pu*/*fu*. The reconstruction \*(*w*)*oji* < \*\**ji* is an alternative possibility that implies an innovation in Ijo.


Table 6.3: NC numerals reflected in Ijo (+)

In any case, the majority of the Proto-Ijo numbers can be traced to their NC prototypes.

### **6.4 Kru**


Table 6.4: NC numerals reflected in Kru (+)

The Proto-Niger-Congo forms are well-preserved in Western Kru (Bassa, Grebo, Klao, Wee). In other branches they are less well represented, especially in Aizi and Seme, where they are nearly completely replaced with innovations (except for the term for 'three', whose reflexes are attested in all the branches).

### **6.5 Kordofanian**

This evidence leads to the conclusion that the number systems of the Kordofanian languages are hardly reconcilable with each other. Moreover, none of them seems to have inherited the NC system (with the exception of 'three' that apparently goes back to its NC prototype, cf. e.g. Katla *ʌ̀-t*"*ʌ*"*́t* '3').

The NC root for 'eight' (< '4') is not represented in the Kordofanian languages. The use of /+?/ for Heiban and Talodi is only due to the fact that the Proto-NC pattern (8 = 4 redupl.) is traceable in them (rather than the form itself), cf. e.g. Warnang (Heiban) *ŋè-làmlàŋ* '4' > *ŋe-lamlaaŋ-ɔ* '8', Lumun (Talodi) *mɔ́ʲɔ̀ɽɪ̀n* '4' > *má-mɔ̀ɾmɔ̀ɾ* '8'. This resemblance, however, may be due to typological (rather than etymological) reasons.

Table 6.5: NC numerals reflected in Kordofanian (+)

### **6.6 Adamawa**

It is important to note that Adamawa is one of the most divergent families within NC, hence the remarks below.

First, despite the diversity of forms, reflexes of the NC prototypes are well represented in many of the branches, e.g. five terms out of the total seven are probably reflected in Mbum Bua, Waja Jen, Waja Waja and Waja Yungur. As in other families, the terms for 'three' and 'four' are the best-preserved.

The table above may create an impression that the term for 'one' is well-preserved in Adamawa as well. This impression is, however, misleading, since multiple forms are reconstructible for 'one'. Moreover, numerical terms attested in particular Adamawa branches go back to a variety of forms (rather than one particular form) that may be unrelated to each other. Thus NC *di* '1' finds parallels in the following branches: Duru *də́ə*, Bua *\*lɛ* and possibly Laal *ɓɨ-dɨ ̀ l?. ́* Its reconstructed allomorph *\*n-di* (with further evolution to \**ni/-in*) may be reflected in Kam *(-i* ¯ *i* ¯ *)*, Jen *-ín*, Waja *-in*, Mumuye ( ?) -*n*i, Yungur ( ?) *-ni*. The terms reflected in Falo \*-*lo*, Bua *dʊ(ŋ* and Kim *ɗú* may go back to the reconstructed NC form *\*do* '1'.


Table 6.6: NC numerals reflected in Adamawa (+)


The forms observable in these two groups cannot be coalesced on the basis of the presently available evidence. Moreover, it bears reminding that the morphological analysis of the majority of the Adamawa numbers is uncertain. This problem cannot be solved at the moment since any firm criteria for distinguishing noun class affixes (or their traces) from the base are lacking.

The same applies to the forms of 'two'. The set of reflexes for the NC term *\*di* '2' quoted in the table above is represented by the following isolated forms: Bua *di-di/ri*, Kim *zí/tʃí-rí*, Day *dīí*, Jen *\*re / rá-b*, Waja *rɔ́-b*, Yungur *raa-p*. Regardless of whether the final **-b** goes back to a suffix or is the result of alignment by analogy (both possibilities are discussed above), it is clear that the relationship of these forms deserves careful examination in the diachronic perspective.

'Four'. This section of Table 6.6 is a result of our cautious treatment of the potentially related forms: the possibility that the forms of Kim-Day *nda* may go back to NC *\*na-* cannot be excluded.

The NC base *\*tan/ton* '5' has not been preserved in any of the Adamawa languages (apart from the doubtful Laal form). On the contrary, reflexes of the alternative NC form *\*nu(n*) are clearly distinguishable in the majority of the midrange NC families such as Dogon, Gur and Kwa, so they should have probably been marked with the plus sign in the table above.

As for the reflexes of 'ten' (NC *\*pu/fu*), it should be noted that all forms marked with the plus sign in the table originally had a voiced labial as their initial consonant: Adamawa *\*buu/buu*. The forms of Adamawa *\*ko-b* probably go back to NC *\*ko* 'hand'.

### **6.7 Ubangi**

Here, NC numbers are well-preserved in Banda and Gbaya-Manza-Ngbaka (each of these branches has four reflexes out of seven) whereas in Ngbandi they have been totally replaced (except for *ta* '3').

The following problematic forms that have been taken as NC reflexes can be reinterpreted as follows (with due attention to their morphological structure and phonetics):

NC *\*di* '1': Banda *bà-lē?,* Ngbaka-Mba *ɓī-nì/bì-rì*, Zande *kí-lī*;

Table 6.7: NC numerals reflected in Ubangi (+)

NC *\*pu/fu* '10': Banda *bu-fu*, Gbaya *ɓú/ɓù-kɔ̀*. Whether the latter form is indeed a NC reflex is not clear (not only due to its phonetics but also because a lexical etymology is suggested for *ɓù*), e.g. Edouard Koya states that *ɓù* means 'person' in Bokoto (Central Gbaya-Manza-Ngbaka), where *ɓù-kɔ̀* '10' (https://mpi-lingweb.shh.mpg.de/numeral/Bokoto.htm). Moñino suggests an alternative etymology (Moñino 1995: 656): «*\*ɓú* 'dix' est en relation avec *\*ɓú* 'façonner, faire un cercle, joindre les mains' ; la série partielle *\*ɓú-kɔ̃́* 'joindre-mains' est encore plus explicite, et décrit le geste qui accompagne l'énonciation du chiffre 10 chez tous les locuteurs». The following meanings of *ɓú* in Gbaya are provided in (Blanchard & Noss 1982: 51):


It is entirely possible that we are dealing with an innovation that follows the pattern described by Moñino. However, similar forms attested in other families may suggest that as finger counting developed, the secondary merger of homonyms occurred.

Finally, the Proto-Ubangi terms for 'two' (*\*se/so*) and 'five' (*\*ko/vo*, possibly a derivative from 'hand') should be mentioned as possible shared innovations.

### **6.8 Dogon**

The Dogon numbers are quite homogeneous, so there is probably no need to treat them by branch. Instead, they will be compared to the numerical terms attested in Bangime, a language which is considered a NC isolate.



Table 6.8: NC numerals reflected in Dogon

**Dogon**. The forms *lɛ́(y)/nɛ́(y)* (with their allomorphs *lɔ́(y)/nɔ́*(*y*)) may be viewed as reflexes of NC *\*di* '2'. The reflex of NC *\*tan/ton* '5' is lacking in Dogon, but the basic form quoted in the table above corresponds to the alternative NC root *\*nu(n*) widely attested in a number of NC families. The term for 'ten' can be compared to *\*pu/fu*, but this comparison should be substantiated. As previously stated, the reflexes of 'three' (Dogon *\*taan*) and 'four' (Dogon *\*nay*(*n*)) appear to be the most consistent, which clearly identifies Dogon as a member of the NC family.

**Bangime**. The Bangime numbers are virtually identical to those of Dogon as far as their etymology is concerned. The form *jíndò* '2' may be a palatalized reflex of *\*di*. The term for 'eight' (*sàáɡín*) is a borrowing from Mande (just as in Dogon where a by-form of this primary term (*sagi*) is widely attested). The only Bangime term that is markedly different from the one found in Dogon is 'ten'.

### **6.9 Gur and Senufo**

Evidence from the ten Gur branches is treated in Table 6.9 (cf. the discussion pertaining to the division of Gur into 16 branches in Chapter 4).

The Southern branch of Central Gur (Dogoso-Khe, Gan-Dogose, Grusi, Kirma-Tyrama) has preserved most of the NC terms (six out of the total seven), whereas its Northern branch (Bwamu, Kurumfe, Oti-Volta) has preserved five. The NC numbers are well-represented in Teen and Wara-Natioro as well. Nearly the entire inventory of NC terms was replaced in Senufo (except for 'three' – Senufo *\*tà̃ã/taàr*), Bariba (except for *i-ta* 'three' and *ǹ-nɛ* 'four') and Kulango (except for *na* 'four' and *tɔ* 'five'). At the same time, Kulango and Teen seem to be the only languages that have a reflex of NC *\*tan/ton* '5'.

As we have seen, the NC numbers are well-preserved in Gur, all the more so since an alternative root for 'five' *(\*nu(n*)) is distinguishable in at least four NC families. Its reflexes are attested in Bariba, Central, and Senufo. In view of this, it can be stated that all seven Proto-NC terms are reflected in Southern Central.


Table 6.9: NC numerals reflected in Gur and Senufo (+)

The term for 'one' is marked with the plus sign in reference to the reflexes of NC *\*do* (Central, Lobi-Dyan, Viemo) or NC *\*di* (Central, Tiefo).

Proto-Oti-Volta (Northern Central) *\*li/yi* and Proto-Grusi (Southern Central) *\*lɛ/le* forms are considered to be reflexes of NC *\*di* '2'. Other forms of 'two' listed in the table represent a common (Proto-Gur?) innovation \**nyo/jo/(ni*?).

The Kulango term for 'three' (*sããbe*) must be a borrowing from Mande.

The innovations for '4' are isolates that are irrelevant to the grouping of branches within the Gur family.

Some innovations for 'five' may go back to the lexical root for 'hand' (< *\*ko*). The pattern for 'eight' (= '4 redupl.') is preserved in three of the branches.

In the case of 'ten', the similarity between the Senufo and Tiefo innovative forms is noteworthy.


### **6.10 Mande**

This is no doubt the most isolated family with regard to the reflexes of NC numbers (Table 6.10). The maximum number of reflexes attested in particular branches does not exceed three (out of the total seven). In some of the branches, only two terms have been preserved. At the same time, the branches are quite compact, which enables us to discuss shared innovations within the Proto-Mande number system. The question as to whether these Proto-Mande innovations are of a lexical or morphological nature remains open.

The most 'radical' etymological scenario is as follows:

The term *keden* '1' could be explained as going back to *\*ku-den*, which correlates well with the Proto-NC form *\*ku-di(n)* (with **ku-** being the most likely Proto-NC noun class prefix (class 1)).

The term *do* '1' is in line with the alternative NC root *\*do* '1' (without a noun class marker).

The Mande term *\*fida/fide* could be interpreted as going back to *\*fi-de* (assuming the first syllable reflects a noun class, e.g. CL 19).

The term for 'three' could be interpreted as a compound, one that has a reflex of *\*ta* '3' (< *\*tath*) as its first component (the second component remains unidentified).

The Mande term for 'ten' (\**tan*) as found in Western Mande may be a reflex of the Proto-NC form \**tan* 'five' with a semantic shift \*'5' > '5PL' (= '10'). Moreover, its original form may have been preserved in Jowulu.

Any of these bold assumptions may prove true, but at present none of them is sufficiently substantiated, so they are better left for future discussion in the hope that more pertinent evidence will become available over time. In this respect, the study of Samogo and Jowulu looks promising, all the more so since the lack of an up-to-date linguistic investigation of these languages has, as far as I know, been a sore gap in present-day comparative-historical studies of the Mande languages. In addition, these languages are the only ones that seem to preserve reflexes of both NC terms for 'five' (NC *\*tan/ton* and *\*nu(n*)). Moreover, the Jowulu terms that have [p-] ~ [b-] allomorphs may reflect a noun class prefix. The choice between **p-** and **b-** depends on the following consonant: [p-] appears before a voiceless consonant (cf. *p-ʃɪrɛ* '4'), whereas [b-] appears before a voiced consonant (*b-zei* '3', *b-ʒĩĩ* '10').
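The distribution of the Jowulu [p-] ~ [b-] allomorphs can be restated as a small rule. The following Python sketch is only an illustration: the two consonant sets are a simplified, hypothetical inventory that covers the cited forms, and the function names are invented for this example rather than taken from any description of Jowulu.

```python
# Simplified, hypothetical consonant classes (an assumption made for this
# sketch); they are not a full Jowulu inventory.
VOICELESS = set("ptkfsʃ")
VOICED = set("bdgɡvzʒ")


def jowulu_prefix(stem: str) -> str:
    """Pick the prefix allomorph from the voicing of the stem-initial consonant."""
    initial = stem[0]
    if initial in VOICELESS:
        return "p-"
    if initial in VOICED:
        return "b-"
    raise ValueError(f"initial segment {initial!r} is not covered by this sketch")


def prefixed(stem: str) -> str:
    """Attach the selected allomorph to the stem."""
    return jowulu_prefix(stem) + stem


# Stems taken from the forms cited in the text: '4', '3', '10'.
for stem in ["ʃɪrɛ", "zei", "ʒĩĩ"]:
    print(prefixed(stem))
```

The rule looks only at the first segment of the stem, which is all that the three cited forms allow us to check.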


Table 6.10: NC numerals reflected in Mande (+)

### **6.11 Mel**

The numeral system of the proto-language is generally poorly preserved in both of the Mel groups. However, it should be noted that the most apparent innovations ('four' and 'two') are found in both groups, thus being important isoglosses useful to the assessment of Proto-Mel.


Table 6.11: NC numerals reflected in Mel (+)

The symbol /~/ in the section of the table dealing with the Northern Mel term for 'five' indicates that the form allows for a two-fold morphological analysis, namely *kə-ʈa-maʈ* (< \**kə-ʈa* + suffix < root *ʈa* 'hand'?) or *kə-ʈa-m-aʈ* (< root *mV*).

In the Northern group, as well as in a number of other NC families, the term for 'one' is reconstructible as CL-*in* '1' (< NC *\*n-di*). The forms reconstructed for the Southern group include *\*lɛ, \*lɔ* '1' (< *\*di*, *\*do*). Languages of the Northern group preserve the basic form of 'ten', cf. Landuma *pù* '10', Temne '10'.

### **6.12 Atlantic**

The Atlantic languages comprise two major groups, namely Northern and Bak (the members of the latter are highlighted in grey in the table above).

The Proto-NC numbers are generally better represented in Northern than in Bak (cf. the distribution of data pertaining to 'three', 'four' (generally the most persistent terms) and 'ten' in the table above). The only Northern sub-group where the Proto-NC numbers are poorly preserved is Cangin, while Fula-Sereer, Tenda, Wolof and Nalu are the most conservative.

The distribution of reflexes and innovations presented in the table above suggests the following historical development:

Reflexes of all major Proto-NC terms were present in Proto-Atlantic. The distribution of the terms for '1' may point to the existence of two dialect zones. A form that goes back to NC *\*(n)-di* '1' became predominant in the ancestral dialect of Proto-Northern, whereas in the ancestral dialect of Proto-Bak the main form was NC *\*do* '1'. A specific phonetic (or morphological?) innovation of Proto-Atlantic (in contrast to NC) is the presence of the final **\*-k** in its numerical terms.


Table 6.12: NC numerals reflected in Atlantic

Proto-Northern inherited all basic Proto-Atlantic terms that go back to NC prototypes.

The term for '2' has been preserved in Fula-Sereer (\**di-k* '2') and in Nalu (in all three languages). A (shared?) innovation developed in Cangin and Nyun-Buy (*\*na-k* '2'). Another innovation is characteristic of Tenda-Jaad-Biafada (*\*ki* '2').

The terms for 'three' and 'four' have been preserved in the majority of the Northern Atlantic languages (cf. e.g. Proto-Fula-Sereer *\*tati-k* '3', *\*na(y)i-k* '4').

The NC root *\*tak/tok* '5' is probably reflected only in Fula-Sereer (*\*ɓe-tV-k*) and Buy (*ju-roo-g*, cf. Wolof *\*ju-rom*?). In the majority of the Northern languages the original form was replaced with the pattern '5' < 'hand', which may have influenced the replacement of the pattern \*'8' = '4 redupl.' with '8' = '5' ('hand') + '3'.


The term for '10' has been preserved in three sub-groups (Wolof *\*fu-kk*, Tenda *\*pə-xw*, Jaad-Biafada \**po*). In the remaining sub-groups it is replaced with isolated innovations.

The Proto-Bak numeral system underwent dramatic changes.

The original term for 'two' was replaced with the innovation -*ɬubəʔ* '2', with its reflexes being traceable in three out of four sub-groups.

The reflexes of the Proto-NC terms for 'three' and 'four' are lacking. Moreover, a shared innovation *baakər* '4' is observable in Joola-Manjak.

The original term for 'five' has been preserved in numerous Joola dialects, including Bayot (Proto-Joola \**fu-tɔ-k* '5').

The Proto-pattern '8' < '4' has been preserved in Manjak (Mankanya *ŋɨ-bakɨr* '4' > *bakɾ-ɛ̂ŋ* '8', Pepel *ŋ-uakr* '4' > *bakar-i* '8') and Balant (despite the fact that the original term for 'four' was replaced with an innovation in this language, cf. Balant Ganja *tàllá* '4' > *táhtállà* ~ *tántállà* ~ *táttállà* '8' as recorded by Denis Creissels).
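The Balant Ganja forms illustrate the pattern '8' = '4 redupl.' in a way that can be checked mechanically. The sketch below is a rough illustration, not an established procedure: it strips tone marks and then tests whether the term for 'eight' ends in the term for 'four', returning whatever material precedes it. The function names are invented for this example, and removing every combining mark is a simplifying assumption that would also delete nasalization marks in other data.

```python
import unicodedata


def strip_tones(form: str) -> str:
    """Remove combining marks (here: tone diacritics) after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def reduplicant(four: str, eight: str):
    """Return the material preceding '4' inside '8', or None if '8' does not end in '4'."""
    base, derived = strip_tones(four), strip_tones(eight)
    if derived != base and derived.endswith(base):
        return derived[: -len(base)]
    return None


# Balant Ganja forms as cited in the text (recorded by Denis Creissels).
BALANT_FOUR = "tàllá"
BALANT_EIGHT_VARIANTS = ["táhtállà", "tántállà", "táttállà"]

for variant in BALANT_EIGHT_VARIANTS:
    prefix = reduplicant(BALANT_FOUR, variant)
    # In each variant the preceding material begins with the first CV of '4'.
    print(variant, prefix, prefix is not None and prefix[:2] == strip_tones(BALANT_FOUR)[:2])
```

The Manjak forms cited above derive 'eight' from 'four' by other means than reduplication, so a check of this kind returns `None` for them; they would need a different comparison.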

The term for '10' was replaced with innovations. Here (just as in the case of '4') we have another shared Joola-Manjak innovation (*ntaaja*). This seems to be another solid argument in favor of grouping these languages together.

### **6.13 West African NC isolates**

We will conclude with an overview of the number systems attested in three NC isolates. These languages are traditionally grouped together with Mel or Atlantic (for seemingly no substantial reason, see Pozdniakov & Segerer 2007).

Table 6.13: NC numerals reflected in Sua (+)


The reflexes of 'three' and 'four' have been preserved in Sua (*b-rar* and *b-nan* respectively). It should be noted that the innovation for 'two' is comparable to that found in Mel.

The term for 'ten' is possibly a borrowing from Mande *tan* '10'.

The term for 'five' may reflect the alternative NC root *\*nu(n*) '5' (Gola *nɔ̀ɔ̀nɔ̀ŋ*). The forms for 'five' and 'ten' in the Koelle records include [-f]: *ta-sóóf ~ ka-sóóf* '5', *koof* '10'.



Table 6.14: NC numerals reflected in Gola (+)

The form *bi-le* 'two' is noteworthy in that it may be interpreted as a direct reflex of NC *\*be-di* '2'.

### **6.14 Summary**

The results of our reconstruction of the basic numeral terms are presented in Table 6.16.


Our step-by-step reconstruction has yielded the following results.

The terms for 'three' and 'four' (*\*tath* '3' and *\*na(h)i* '4' respectively) are, as expected, the most stable within the NC number system. Their reflexes are rarely absent.

Surprisingly, the term for '2' appears to be the least persistent, all the more so since it is the only numerical term on the Swadesh list. The reconstructed root for 'two' (*\*di* '2') is traceable in only nine (out of nineteen) branches. This may raise doubts as to whether the proposed reconstruction is correct. However, as we have tried to demonstrate above, no alternative reconstruction suggests itself on the basis of available evidence. The term for '2' shows a great variety of forms, while at the same time being surprisingly persistent in particular branches (and at other times rather divergent). Thus, the apparent Mande innovation *\*pila/fila* '2' is present in all Mande languages.

Table 6.16: Niger-Congo numerals reflected in various families (+)

The most conservative NC branches in terms of the reflection of Proto-NC numbers are Gur, Adamawa and Kwa. All bases/patterns listed in the table have been preserved in Gur, including the alternative bases for 'one' and 'five'. The only reflex that is missing in Adamawa (as well as in Ubangi) is *\*tan/ton* '5'. All Proto-terms have their reflexes in Kwa (except for the alternative base for 'one', i.e. *\*do*).

The inventory of the Proto-NC terms is well-preserved in the Bantoid languages, with only two alternative bases lacking (*\*do* '1' and *\*nu(n)* '5'). These reflexes are missing in other BC branches outside the Bantoid languages as well. The reflex of *\*pu* '10' is not present in Bantu as it was replaced with the Bantoid innovation *\*kum/kam/ɣam* (Proto-Bantu \**kʊ́mì/kámá* '10').

It would seem improper to define the branches with the lowest number of NC reflexes as the most distant from Proto-NC. The probability of finding a reflex of a NC prototype in an isolate (e.g. Gola or Laal) is much lower than, say, in the huge Benue-Congo family. At the same time, the massive replacement of numerical terms in the small West African branches such as Bak (Atlantic), Mel and Dogon is noteworthy.

The Kordofanian languages are the most remote from Proto-NC, as the only term with a NC prototype attested in them is *\*tath* '3'. The term for '8' is based on '4', which may be seen as another bond between Kordofanian and Proto-NC. However, this pattern may have developed in Kordofanian independently.

### **6.15 Conclusion**

In conclusion, I would like to highlight the thesis that I personally consider to be the most important. For me, the current study is an experimental project that aspires to demonstrate what can be done (if anything) in terms of the NC reconstruction, given that a step-by-step reconstruction is not available for all the families and branches of this macro-family.

In this experiment, the emphasis was placed on providing an exhaustive account of the distribution of forms by families, groups and branches. Quasi-reconstructions of Proto-NC numbers that resulted from the process should be viewed as mere possibilities. My intention was to present evidence that the reconstructions offered in this book are more probable than any others.

The author sees his major goal as providing a substantial discussion of the most likely reconstructions of Proto-NC numbers, in the hope that linguists specializing in particular NC families (as well as those who provide speculative 'etymologies') will finally join the debate. Chapter 4, which is the lengthiest and the most important chapter of the book, contains 'technical proposals' regarding the reconstruction of numbers within each of the numerous branches of the macro-family. I would like to thank the specialists who kindly joined the discussion while the book was still in preparation and whose opinions were duly accounted for. I would be grateful if other specialists critically examined the evidence presented in this book and gave their evaluation of data that lies within their competence. Hopefully, this will pave the way for a real reconstruction of the NC number system. Today it is evident that plausible reconstructions in terms of a macro-family that comprises one and a half thousand languages can only result from the cooperation of dozens of specialists. This book aims at providing data for such an effort.

I hope that the methodology tested in this book will be of use for the reconstruction of the NC lexicon in general. In any case, the author sees no other way of approaching this objective of utmost importance in the coming decades.

## **Appendix A: Groupings of numerals by noun classes in 254 BC languages**

Table A.1: Akpes


Table A.2: Bantu A



Table A.3: Bantu B

Table A.4: Bantu C


Table A.5: Bantu D


Table A.6: Bantu E


Table A.7: Bantu F



Table A.8: Bantu G

Table A.9: Bantu H



Table A.10: Bantu J

Table A.11: Bantu K



Table A.12: Bantu L

Table A.13: Bantu M


Table A.14: Bantu N


Table A.15: Bantu P


Table A.16: Bantu R


Table A.17: Bantu S


Table A.18: Beboid


Table A.19: Cross



Table A.20: Defoid

Table A.21: Edoid


Table A.22: Grassfields


Table A.23: Idomoid


Table A.24: Igboid


Table A.25: Isimbi


Table A.26: Jukunoid


Table A.27: Mamfe


Table A.28: Mbam



Table A.29: Mbe


Table A.30: Ndemli

Table A.31: Nupoid


Table A.32: Oko


Table A.33: Platoid


Table A.34: Tivoid


## **Appendix B: Statistics of numeral groupings by noun classes in 254 BC languages**

The number of languages with a numeral-specific class marker (that is different from those used with other numerals, including the zero marker) is specified under *Specific CL*. E.g. there is a specific marker for 'one' in 174 languages (out of the total 254). At the same time, a specific marker is rarely used for the term for 'three', being attested in only six languages. The next row (*Distant grouping*) accounts for the cases when a numerical term is grouped by class not with the adjacent number but rather with another term that is separated from it by at least one other number. E.g. the grouping with non-adjacent numbers by class is attested for the term for 'four' in six of the languages under study. In one of the Eggon dialects it has the same class as the term for 'six' (*ù-ɲí* '4', *ù-fín* '6'), whereas the rest of the numerals belong to other classes. In Icheve, the term for 'four' shares its class with the term for 'eight' (*mí-ɲɪ̀n* '4', *mí-nùínì* '8'), likely because 'eight' derived from 'four' in this language. At the same time, this class is not characteristic of other numerals. A similar situation is observable in Kenyang, the only difference being that the noun class attested with 'four' and 'eight' also includes 'nine' (*mɛ́-nwî* '4', *mɛ́-nɛ̀n* '8', *mɛ́-nɛ̀n nɛ̀ àmɔ̀t* '9' (8+1)). The group '4'/'8-10', which is distinguishable in two Grassfields languages (Yemba (Dschang) and Ngiemboon – **le-** class), belongs here as well.

The widest-attested (as well as lacking) groups for each number within a column are marked in red. For example, under 'one' we see that a specific noun class incompatible with other numerals is attested with the term for 'one' in 174 languages (out of the total 254). This is the most typical situation; by contrast, a specific noun class shared by 'one' and 'two' and incompatible with other numbers is observable in only four languages. The study of the widest-attested combinations of numbers and class markers shows that a specific class marker is often used with the BC terms for 'one', 'seven', 'eight', 'nine' and 'ten', whereas the terms covering the sequence from 'two' to 'six' are often grouped by class with other numbers, i.e. with each other to be precise.
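The two row labels used in this appendix, *Specific CL* and *Distant grouping*, can be read as simple tests on a mapping from numerals to class markers. The Python sketch below is a minimal illustration under stated assumptions: the function names are invented here, and the toy dictionary is hypothetical except for the shared marker on 'four' and 'six', which follows the Eggon dialect example cited above.

```python
def specific_class(classes: dict, n: int) -> bool:
    """True if the class marker of numeral n is shared with no other numeral."""
    return all(marker != classes[n] for m, marker in classes.items() if m != n)


def distant_grouping(classes: dict, n: int) -> list:
    """Numerals sharing n's class when none of them is adjacent to n.

    Returns an empty list if n shares its class with an adjacent numeral
    or with no numeral at all.
    """
    same = [m for m, marker in classes.items() if m != n and marker == classes[n]]
    if any(abs(m - n) == 1 for m in same):
        return []
    return sorted(same)


# Toy data: only the marker shared by '4' and '6' reflects the Eggon example
# (ù-ɲí '4', ù-fín '6'); every other label is a hypothetical placeholder.
EGGON_LIKE = {1: "a", 2: "b", 3: "b", 4: "ù", 5: "c",
              6: "ù", 7: "d", 8: "e", 9: "f", 10: "g"}

print(specific_class(EGGON_LIKE, 1))    # 'one' has a class of its own
print(distant_grouping(EGGON_LIKE, 4))  # 'four' groups with non-adjacent 'six'

# Share of languages with a specific marker for 'one', from the figures in the text.
print(round(174 / 254 * 100, 1))
```

Counting, for each numeral, the languages in which these tests succeed would give the kind of totals reported in the two rows; the real tables rest on the author's data, which this sketch does not reproduce.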



## **Appendix C: Alignments by analogy**



## **Appendix D: Numerals for '1' in the Cross languages**




## **Appendix E: The main sources for the 1000 NC languages cited**

The NC languages and their main sources are organized by family. Within the Benue-Congo family they are then organized by groups, and within Bantu they are organized by zones. The second column lists the main bibliographical sources. The third column indicates the names of the contributors in Chan's database [Chan]. A semicolon separates each source.


Table E.1: BC: Bantoid







Table E.3: BC: Bantoid: Bantu, B



Table E.4: BC: Bantoid: Bantu, C


Table E.5: BC: Bantoid: Bantu, D

Table E.6: BC: Bantoid: Bantu, E




Table E.7: BC: Bantoid: Bantu, F


Table E.8: BC: Bantoid: Bantu, G


Table E.9: BC: Bantoid: Bantu, H





Table E.11: BC: Bantoid: Bantu, K

Table E.12: BC: Bantoid: Bantu, L



Table E.13: BC: Bantoid: Bantu, M

Table E.14: BC: Bantoid: Bantu, N


Table E.15: BC: Bantoid: Bantu, P



Table E.16: BC: Bantoid: Bantu, R

Table E.17: BC: Bantoid: Bantu, S



Table E.18: BC: Bantoid Grassfields


Table E.19: BC: Cross



Table E.20: BC: Defoid

Table E.21: BC: Edoid



Table E.22: BC: Idomoid




Table E.24: BC: Kainji, Platoid



Table E.25: BC: Jukunoid





Table E.27: BC: isolates

Table E.28: Kwa




Table E.29: Ijo



Table E.30: Kru



Table E.32: Kordofanian


Table E.33: Adamawa



Table E.34: Ubangi









Table E.37: Gur–Senufo




Table E.38: Mande





Table E.39: Atlantic





Blench, Roger. 2007. *Comparative Ijoid wordlist*. Ms.



Burssens. 1994. *Yansi*. Ms.





Lamp, Frederick John. 2016. *Baga Tshi-tem*. Electronic document.


l'Société des Études Linguistiques et Anthropologiques de France (SELAF) avec le concours du Centre National de la Recherche Scientifique (CNRS).



Rongier, Jacques. 1996. Aperçu sur le moyobe. *Cahiers voltaïques/Gur papers* 1.


Roulon-Doko, Paulette. 2008. *Dictionnaire gbaya-français*. Paris: Karthala.

Rowland Oke, Mary. 2003. *Description systématique de la langue obolo-andoni*. Paris: L'Harmattan.



*mande (Языки мира: Языки манде) [Languages of the world: Mande languages]*, 172–212. St. Petersburg: Nestor-Historia.


*tive Afrikanistik: Sprach-, geschicht- und literaturwissenschaftliche Aufsätze zu Ehren von Hans G. Mukarovsky anlasslich seines 70. Geburtstages*, 1–2. Wien: Afro-Pub.


Abiodun, Michael Ajibola, 359 Agoyi, Taiwo O., 359 Akpes, 55 Association pour la promotion de la langue Mamara, 371 Baga Fore, vii Bakpa, Mimboabe, 371 Balant, vii Bao Diop, Sokhna, 377 Barry, Abdoulaye, 375, 376 Bassène, Alain-Christian, 375 Beavon, Keith H., 338 Beavon, Mary, 338 Belliard, François, 338 Bendor-Samuel, John T., 3 Bertho, [Révérend] [Père] J, 360 Beyer, Klaus, 371 Biaye, Séckou, 42, 246, 375 Bird, Steven, 351 Blanchard, Yves, 303 Blecke, Thomas, 373 Blench, Roger, 57, 58, 95, 138, 342, 352, 353, 357–359, 361, 364, 366, 367 Bloemarts, Maarten, 278, 370 Bôle-Richard, Rémy, 360 Borchardt, Nadine, 359 Bostoen, Koen, 340, 346, 348 Botne, Robert Dale, 342 Bouquiaux, Luc, 99, 357 Boursier, Daniel, 367

Boyd, Raymond, 57, 72, 105, 148, 153, 259, 282, 289, 290, 336, 365– 367 Boyd, Virginia L., 367 Boyeldieu, Pascal, 158, 363, 365, 366 Brindle, Jonathan, 370 Brisson, Robert, 367 Brosnahan, Leonard F., 334, 352, 353 Bryant, Daniel, 369 Bua, vii Bühnen, Stefan, 377 Buis, Pierre, 376 Burmeister, Jonathan, 359 Burssens, 340 Byarushengo, Ernest Rugwa, 346 Calame-Griaule, Geneviève, 369 Carlson, Robert J., 216, 224, 372, 373 Carlton, Elizabeth M., 375, 376 Carrington, John F, 11, 341 Chan, Eugene S. L., 335–348, 351– 377 Childs, George Tucker, 369 Christaller, Johann Gottlieb, 361 Clarke, Mary Lane, 363 Cloarec-Heiss, France, 367 Cobbinah, Alexander Yao, 377 Connell, Bruce A., 74, 291, 333, 334, 352, 353 Coupez, André, 342 Crabb, David Wendell, 335, 336 Crane, Thera M., 340

Creissels, Denis, 42, 246, 373–375 Crétois, Léonce, 377 d'Alton, Paula, 377 d'Avezac, Armand, 244, 374, 375 Dalby, David, 369 De Grauwe, Jan, 346 De Lespinay, Charles, 377 De Rasilly, Père, 278, 370 De Rendinger, R., 365, 367 De Wolf, Paul Polydoor, 375 Dendo, Mallam, 357 Dettweiler, Sonja G., 84, 356, 357 Dettweiler, Stephen H., 84, 356, 357 Diagne, Mbacké, 375 Dièye, El Hadji, 376 Dimmendaal, Gerrit Jan, 73, 74, 290, 333, 334, 352, 353 Diouf, Jean-Léopold, 377 Djilla, Mama, 216, 373 Dombrowky-Hahn, Klaudia, 372 Doneux, Jean Léonce, 376 Donzo Bunza, Jean-Pierre, 341 Dorvlo, Kofi, 361 Dumestre, Gérard, 373 Durieux, Jude, 363, 368, 369 Durieux-Boon, Evelin, 363, 368, 369 Eboué, Félix, 367, 368 Egner, Ingeborg, 362 Ekambi, Aline Etondi Boumda, 336 Elders, Stefan, 197, 371 Elias, Philip, 364 Elugbe, Ben Ohi[omambe, 276, 354 Ernst, Urs, 338 Fabre, Anne Gwenaëlle, 366

Fenning, Charles D., 213 Ferry, Marie-Paule, 375–377

Fiedler, Ines, 371, 373 Flavier, Sébastien, 3 Fransen, Margo A. E., 351 Fresco, Edward M., 354 Ganong, Tina Weller, 279, 369 Gardner, Ian, 352 Gaved, Tim, 376 Golovko, Ekaterina, 375 Green, Eldred I. Ibibiem T., 361 Greenberg, Joseph Harold, 2, 4, 336 Grégoire, Henri Claude, 373 Grollemund, Rebecca, 339 Guéhoun, Augustin, 362 Guest, Elizabeth, 364 Halaouï, Nazam, 40, 374 Hantgan, Abbie, 363 Harley, Matthew W., 47, 361 Heath, Jeffrey, 368, 369 Henson, Bonnie J., 338 Hérault, Georges, 359–361 Hochstetler, Lee, 216, 277, 373 Hyman, Larry M., 351, 358 Ibrahim-Arirabiyi, Femi, 359 Ikaan, 55 Innes, Gordon, 362 Janssens, Baudoin, 58 Jisa, H., 351 Joly, A, 365 Jones, Ross, 373 Jungraithmayr, Herrmann, 365–367 Kagaya, Ryohei, 349 Kaliai, M. H. I, 139, 361 Kamba-Muzenga, Jean-Georges., 348 Kari, Ethelbert E., 334, 353

Kastenholz, Raimund, 151, 374 Kato, Barau, 95, 358, 366 Keita, Mamadou, 359 Kéné, 368, 369 Khachaturyan, Maria, 374 Kleinewillinghöfer, Ulrich, 151, 372 Koelle, Sigismund Wilhelm, 42, 83, 139, 217, 247, 248, 277, 278, 292, 333, 334, 336, 351–356, 358–363, 369–371 Koni Muluwa, Joseph, 340, 346, 348 Konoshenko, Maria, 374 Kraft, Charles H., 335, 367 Kropp Dakubu, Mary Esther, 44, 351 Kushnir, Elizaveta, 374 Kuznetsova, Natalia, 373 Kuznetsova, Olga, 373 Laal, vii Lamp, Frederick John, 369 Le Bris, Pierre, 373 Lessau, Donald Andreas, 374 Lufu, 55 Lukas, Johannes, 365, 366 Mackay, Hugh D, 355 Maddieson, Ian, 335, 336 Magaji, Daniel J., 358 Maganga, Clement, 344 Maloletnyaya, Anna, 373 Manus, Sophie, 349 Marchese, Lynell, 362 Mbah, Mathaus N., 351 Medjo Mvé, Pither, 339 Melzian, Hans J., 374 Meyer, P. Gérard, 376 Miehe, Gudrun, 52, 198, 370, 371 Mishchenko, Daria, 374

Moñino, Yves, 51, 172, 173, 303, 367, 368 Montlahuc, Marie-Laure, 343 Morris, Pamela, 374 Motingea Mangulu, André, 341 Musinguzi, Charles, 347 M'Bodj, Chérif, 377 Naden, Anthony Joshua, 198 Ndao, Dame, 42, 377 Newcomer, Betsy, 368 Nikitina, Tatiana, 374 Noss, Philip A., 303 Nougayrol, Pierre, 365 Nurse, Derek, 272, 277, 291, 339, 340, 342–347, 349 N'Guessan, Jérémie Kouadio, 360 Oko, 55 Olson, Kenneth S, 367 Orungu, vii Ouzilleau, François Marie Frédéric, 367 Pairault, Claude, 366 Paperno, Denis, 373 Paulin, Pascale, 291 Payne, Stephen, 376 Perekhvalskaya, Elena, 226, 227, 373, 374 Philippson, Gérard, 272, 277, 291, 339, 340, 342–347, 349 Pichl, Walter J., 369, 376 Pozdniakov, Konstantin, 6, 12, 23, 29, 35, 158, 229, 231, 234, 236, 252, 267, 310 Prost, André, 201, 213, 216, 371, 373 Raen, Konstanse, 366 Rand, Sharon R., 375, 376

Reinike, Brigitte, 371, 372 Robert, Stephane, 376 Robert, Stéphane, 12 Rogers, Kirk, 369 Rongier, Jacques, 353, 359, 362, 371 Roulon-Doko, Paulette., 367 Rowland Oke, Mary, 334, 353 Ruelland, Suzanne, 366 Sachnine, Michka, 354 Sachot (Santos), Rosine, 376 Salama-Gray, Kisanga, 342 Sambou, Pierre, 374, 376 Sambou, Pierre-Marie, 376 Sapir, J. David, 252, 375 Sawadogo, Tasséré, 201, 370–372 Schadeberg, Thilo C., 344, 364 Sebeok, Thomas A., 2 Segerer, Guillaume, 3, 6, 229, 231, 246, 252, 267, 310, 363, 375, 376 Seidel, Frank, 376 Seydou, Christiane, 375 Shimizu, Kiyoshi, 366 Simons, Gary F., 213 Smeltzer, Brad, 373 Smeltzer, Suzan, 373 Smith, Rebecca Dow, 357 Snider, Keith L., 360, 361 Solomiac, Paul, 373 Soubrier, Aude, 360 Stammers, Jon, 376 Stewart, John Massie, 270 Suggett, Colin, 372 Sumbatova, Nina, 369 Sweetman, Gary, 365 Tadadjeu, Maurice, 351

Taylor, Charles V, 347

Taylor, Frank William, 375 Tham, Florian, 198, 371 Thomas, Northcote Whitridge, 362, 369 Tingbo, Th., 367 Tourneux, Henry, 375 Trifkovic, Mirjana, 376 Urua, Eno-Abasi, 334, 352 Van der Veen, Lolke, 337–341 Vanderelst, John, 364 Vanhoudt, Bettie, 56 Vansina, Jan, 341 Vogler, Pierre, 140, 362 Von Roncador, Manfred, 371 Vydrin, Valentin, 41, 213, 226, 227, 363, 373 Vydrina, Alexandra, 374 Vydrine, Valentin, 220, 374 Weiss, P. Henri, 375 Welmers, William, 374 Westermann, Diedrich, 2, 363, 374 Williams, Gordon, 376, 377 Williams., Sara, 376, 377 Williamson, Kay Ruth Margaret, 2, 3, 8, 55, 72, 104–107, 335, 336 Wilson, William André Auquier, 42, 279, 363, 369, 375–377 Winkelmann, Kerstin, 195, 199, 370– 372 Wintz, R. P., 376 Wolff, Hans, 352 Yaya, Daïrou, 375

Ábādṣa, 83, 355 Abbey, 49, 122, 359 Abiji, 49, 122, 359 Abon, 267, 335 Abron, 44, 45, 271, 287, 359 Abua, 267, 352 Abuan, 321, 334, 352 Abure, 45, 359 Acheron, 31, 280 Acipa, 85–88, 90, 91, 93, 94 Adampe, 119, 359 Adele, 46, 47, 121, 126, 127, 359 Adioukrou, 49, 359 Agatu, 107, 322, 355 Agaushi, 85–88, 91, 93, 94 Agni, 45, 125, 271, 359 Agoi, 333, 352 Agwagwune, 321, 333, 352 Ahanta, 45, 45<sup>4</sup> , 49, 125, 359 Ahlo, 120, 359 Aizi, 140–143, 299, 362 Aja, 47, 119, 359 Ajumbu, 61, 335 Akan, 44, 49, 126, 287, 359 Akaselem, 185, 186, 370 Akebu, 47, 359 Akoose, 315, 337 Akpes, 2, 73, 103, 105–117, 288, 296, 315, 359 Aku, 277, 354 Akum, 18, 323, 358

Alago, 21, 107, 322, 355 Alege, 23, 283, 288, 333, 352 Alladian, 49, 123, 126–137, 298, 359 Amo, 85–88, 90, 91, 93, 94, 275, 356 Anaang, 275, 334, 352 Anii, 46, 121, 359 Animere, 120, 359 Anufo, 45, 360 Arabic, 145, 154, 155, 163, 164, 170, 173, 174, 180, 181, 292 Ari, 49, 360 Ariɡidi, 78, 354 Aro, 83, 355 Ashanti, 44, 360 Asu, 318, 345 Attié, 122, 123, 127–137, 298, 360 Avatime, 48, 120, 360 Avikam, 48, 123, 360 Awak, 157, 160, 365 Awutu, 45<sup>5</sup> , 360 Ayere, 21, 78, 322, 354 Ayu, 19, 95–98, 100, 107, 109, 324, 357 Ɓa, 153, 365 Baatonum, 52, 186, 278, 370 Bafanji, 322, 351 Bafia, 69, 69<sup>6</sup> , 338 Bafo, 287, 337 Bafut, 69, 351 Baga Fore, 27, 28, 229, 236, 287, 375 Baga Koba, 229, 279, 369

Baga Maduri, 229, 369 Baga Mboteni, 229, 236, 237, 266, 287, 375 Baga Sitemu, 229, 279, 369 Bagirmi, 154, 155, 164, 169–171 Bago-Kusuntu, 185, 370 Baka, 259, 367 Bakaka, 315, 337 Bakoko, 315, 337 Bakwe, 140–143, 362 Balant, 15, 28, 42, 246, 247, 249, 250, 250<sup>37</sup> , 251, 279, 282, 284, 309, 310, 330, 375 Bali, 51, 148, 317, 342, 365 Bali (Kibali), 317, 342 Balong, 274, 337 Bamana, 183, 221, 226, 227, 373 Bamileke, 57–61, 63, 65, 66, 68, 70, 71, 275, 351 Bamun, 275, 351 Bamwe, 316, 341 Banda, 172, 175–181, 302, 303, 367 Bandawa, 267, 358 Bandi, 221, 224, 373 Bangala, 180, 316, 341 Bangime, 181, 183, 226, 303, 304, 363 Bangunji, 50, 157, 266, 365 Banjal, 26–28, 41, 242–245, 280, 375 Bankala, 267, 335 Bankon, 19, 63, 287, 315, 337 Baoule, 125, 360 Bapen, 5, 234, 266, 375 Barama, 316, 339 Bariba, 184, 186, 190, 191, 203–212, 278, 304, 305, 370 Barombi, 315, 337 Basa, 85–91, 93, 94, 106, 356 Basari, 5, 27, 28, 234, 375

Basila, 126, 127, 360 Bassa, 39, 140, 140<sup>14</sup> , 141–143, 299, 331, 362 Batanga, 315, 337 Baule, 45, 45<sup>3</sup> , 271, 287, 360 Bayanga, 259, 367 Bayot, 26–28, 41, 241–245, 310, 330, 375 Bayot (Guinea Bissau), 26 Bayot (Sénégal), 26, 375 Bebe, 321, 335 Bebil, 315, 338 Bedik, 234, 375 Befang, 57, 59–61, 63, 65, 66, 68, 70, 71, 335 Bekwarra, 105, 106, 333, 352 Bekwil, 19, 315, 338 Bemba, 320, 349 Ben Tey, 182, 368 Bena, 286, 318, 345 Bende, 285, 317, 344 Bendi, 73–77, 106, 290, 333, 352 Beng, 221, 373 Benga, 315, 337 Besme, 146, 155, 365 Bete, 18, 84, 106, 140–143, 321, 333, 352, 358, 362 Bete (Juk.), 84, 358 Bete-Bendi, 18, 106, 321, 333, 352 Bhele, 67, 317, 342 Biafada, 9, 13, 233, 234, 237–240, 279, 309, 310, 375 Biali, 191, 370 Bijogo, 26, 27, 42, 246–251, 279, 309, 375 Bimoba, 185, 188, 370 Birifor, 188, 287, 370 Birom, 95–100, 267, 288, 357

Bisa, 213, 217, 373 Bliss, 242–245, 375 Bobo, 213–220, 222–228, 307, 373 Boko, 4, 183, 213, 217, 224–226, 360, 373 Bokobaru, 217, 226, 373 Bokoto, 302, 367 Bokyi, 106, 283, 288, 321, 333, 352 Bolgo, 154, 276, 365 Bolondo, 316, 341 Bom, 276, 277, 369 Bomasa, 259, 367 Bomwali, 14, 315, 338 Bondei, 285, 345 Bongili, 316, 341 Bonkeng, 261, 337 Boyawa, 284, 357 Bozo, 183, 213–216, 218–220, 222, 223, 225–228, 307, 373 Bua, 153, 154, 159–171, 261, 276, 300– 302, 365 Bubi, 261, 266, 274, 337 Budu, 64, 317, 342 Budza, 22, 316, 341 Buji, 85–88, 91, 93, 94, 356 Bukusu, 319, 347 Buli, 52, 184, 188, 191, 192, 194, 260, 370 Bullom, 229, 230, 369 Bungu, 274, 344 Bunu, 85–88, 91, 93, 94, 107, 356 Burak, 50, 156, 160, 282, 365 Busa, 4, 40, 213, 217, 221, 224, 225, 330, 373 Bushong, 266, 316, 341 Bute, 39, 330, 335 Bwamu, 184, 187, 190, 191, 203–212, 278, 304, 370

Byep, 64, 315, 338 Cawai, 90, 356 Cebaara, 186, 370 Chaga, 283, 330, 343 Chakali, 185, 370 Chala, 52, 185, 187, 370 Chamba, 49, 50, 52, 57, 59–61, 63, 65–68, 70, 71, 275, 330, 335 Chamba-Daka, 57, 59–61, 63, 65–68, 70, 71, 335 Cherepon, 45, 360 Chiga, 15, 25, 319, 346 Chuka, 317, 343 Chumburung, 45<sup>5</sup> , 360 Ciluba, 320, 348 Cilungu, 20, 21, 320, 349 Cinda, 88, 356 Dadiya, 50, 157, 266, 365 Dagaara, 52, 188, 370 Dagbani, 188, 267, 370 Dagik, 145<sup>17</sup> , 280, 280<sup>2</sup> Dama, 51, 365 Dan, 225, 337, 371, 373, 376, 377 Dangme, ix, 44, 47, 49, 118, 119, 127–137, 297, 298, 360 Darangi, 85–89, 91, 93, 94, 356 Day, 146, 149, 153, 155, 156, 159–171, 284, 301, 302, 365 Defaka, 138, 139, 276, 299, 361 Deg, 185, 188, 189, 370 Degema, 32, 354 Delo, 52, 185, 284, 370 Dendi, 74 Denya, 18, 323, 335 Dewoin, 140<sup>14</sup> , 362 Dida, 140–143, 219, 280, 291, 362 Digo, 274, 343

Dii, 282, 365 Dijim, 157, 365 Dinaoro, 202, 370 Dirrim, 37, 38, 50, 275, 365 Ditammari, 30, 52, 186, 191, 267, 370 Djimini, 189, 370 Dogose, 52, 187, 189, 190, 195, 203–212, 304, 330, 370 Dogoso, 184, 187, 189, 190, 195, 203–212, 304, 370 Dogulu Dom, 183, 278, 368 Donno So, 183, 278, 368 Doyayo, 147, 365 Duala, 315, 337 Duka, 85–91, 93, 94, 356 Dukku, 85–88, 91, 93, 94, 356 Dumbo, 275, 335 Duru, 147, 148, 151, 152, 155, 159–171, 282, 283, 300, 301, 365 Duungoma, 225, 373 Duupa, 51, 365 Dwang, 45, 360 Dyan, 184<sup>22</sup> , 185, 186, 190, 198, 203–212, 305, 370 Dzuun, 214–216, 218–220, 222, 223, 225–228, 373 Ebira, 102, 324, 358 Ebrie, 46, 49, 124, 360 Ebughu, 19, 292, 321, 334, 352 Ede, 18, 322, 354 Edo, 288, 322, 354 Efai, 292, 334, 352 Efik, 32, 321, 334, 352 Ega, 45, 46, 126–137, 360 Eggon, 20, 95–100, 107, 109, 324, 325, 357 Ejagham, 59, 62, 322, 335 Ejamat, 41, 233, 233<sup>34</sup> , 242–245, 375

Ekajuk, 59, 62, 322, 335 Ekit, 292, 334, 352 Ekoi, 267, 283, 288, 335 Ekpeye, 19, 83, 323, 355 Eleme, 21, 22, 321, 334, 352 Elip, 61, 335 Eloyi, 18, 106, 107, 275, 292, 322, 355 Embu, 261, 283, 288, 317, 343 Enenga, 284, 339 Engenni, 322, 354 Enwang, 292, 334, 352 Enya, 283, 342 Eotile, 45, 124, 271, 360 Esan, 322, 354 Esimbi, 57, 59–61, 63, 65, 66, 68, 70–72, 105–107, 283, 335 Etebi, 292, 334, 352 Eten, 95–100, 109, 357 Etulo, 355 Etulo o-ɲiī, 107 Ewe, 47, 119, 120, 196, 276, 360 Ewondo, 261, 338 Fali, 51, 149, 150, 159–171, 283, 301, 365 Fam, 2, 57, 59–61, 63, 65, 66, 68, 336 Fang, 261, 266, 287, 338 Faniagara, 201, 202, 370 Fanya, 51, 154, 365 Farefare, 188, 370 Fefe, 58, 351 Feloup, 251<sup>38</sup> , 375 Fio, 58, 336 Fipa, 317, 344 Fogny, 41, 241–245, 280, 330, 375 Fon, 43, 47, 276, 360 Fon-Gbe, 276, 360 Foodo, 45, 287, 360 French, 155, 172, 174, 175, 181, 233, 246

Fula, 5, 14<sup>2</sup> , 150–152, 155, 163, 170, 171, 181, 183, 231, 234, 234<sup>35</sup> , 235, 237, 239, 240, 245, 266, 308, 309, 375 Fulfulde, 240, 375 Fuliiru, 62, 64, 347 Fungwa, 85–88, 91, 93, 94, 356 Fyam, 95–98, 100, 357 Ga, ix, 44, 47, 49, 118, 119, 127–137, 297, 298, 360 Gade, 275, 355 Galke, 51, 266, 365 Galwa, 261, 339 Ganda, 25, 346 Gandole, 275, 365 Ganja, 246, 247, 251, 310, 375 Gbanzili, 284, 367 Gbari, 102, 324, 358 Gbaya, 51, 54, 172, 173<sup>19</sup> , 175–181, 272, 302, 303, 331, 367 Gbaya Mbodomo, 172, 367 Gbaya-Bossangoa, 367 Gbe, 39, 43, 44, 47, 49, 119, 120, 127–137, 297, 298, 331, 360 Gbii, 140<sup>14</sup> , 362 Gen, 47, 360 Ghomala, 21, 322, 351 Ghotuo, 322, 354 Gikuyu, 317, 343 Gimme, 49, 50, 147, 330, 365 Ginyanga, 45, 360 Giro, 85–88, 91, 93, 94, 356 Gitonga, 19, 321, 350 Glio-Oubi, 140<sup>15</sup> , 291, 362 Godié, 140–143, 362 Gogo, 288, 318, 345 Gokana, 334, 352

Gola, 229, 252, 253, 257, 258, 260, 266, 269, 278, 289, 310–313, 363 Gongwe, 317, 344 Grebo, 39, 140, 140<sup>15</sup> , 141–143, 299, 362 Guang, 45, 45<sup>5</sup> , 49, 126–137, 360 Gula, 154, 272, 276, 365 Gundi, 259, 367 Gundu, 25, 32, 33, 346 Gungu, 319 Gure, 90, 356 Gurma, 30, 31, 185–188, 193, 194, 260, 278, 370 Gurmana, 107, 356 Guro, 213, 373 Gusii, 274, 317, 347 Gusilay, 41, 280, 376 Gwa, 4, 336 Gwari, 4, 107, 358 Gweno, 39, 317, 343 Gwere, 11, 33, 283, 319, 346 Gyele, 315, 338 Gyem, 90, 356 Gã, 4, 365 Gəunəm, 147, 283, 365 Ha, 54, 319, 331, 347 Hanga, 52, 188, 267, 370 Hasha, 95–98, 100, 357 Hausa, 84, 94, 151, 152, 158, 166, 167, 171, 235, 240, 292 Haya, 274, 319, 346 Hehe, 22, 286, 318, 345 Heiban, 144, 145, 145<sup>17</sup> , 284, 299, 300, 331 Hema, 25, 33, 319, 346 Herero, 274, 350 Holoholo, 274, 342 Horom, 275, 357

Hun-Saare, 85–88, 91, 93, 94, 356 Hunde, 285, 347 Hungworo, 85–88, 90, 91, 93, 94, 356 Hyam, 95–98, 100, 108, 357 Ibani, 4, 5, 138, 139, 361 Ibibio, 267, 334, 352 Ibino, 334, 352 Ibuoro, 334, 352 Icheve, 74, 108, 321, 325, 333, 352 Idakho, 319, 347 Idoma, 107, 322, 355 Idong, 284, 357 Ifè, 18, 322, 354 Igala, 78–81, 322, 354 Igbo, 323, 355 Igo, 48, 276, 360 Iguta, 85–88, 91, 93, 94, 356 Ijaw, 138, 139, 361 Ikaan, 2, 73, 103, 106, 107, 109–117, 275, 282, 283, 296, 359 Iko, 334, 352 Ikom, 333, 353 Ikoma, 317, 342 Ikposo, 47, 120, 360 Ikulu, 95–100, 107, 284, 324, 357 Ikwere, 107, 323, 355 Ilue, 292, 334, 353 Ipulo, 19, 67, 324, 336 Iri, 85–88, 91, 93, 94, 356 Irigwe, 95–98, 100, 275, 357 Íṣiēle, 83, 355 Īsóāma, 83, 355 Isoko, 322, 354 Itu, 334, 353 Ivbie, 322, 354 Izere, 95–98, 100, 357 Izi, 323, 355

Jaad, 9, 27, 138, 233, 234, 237–240, 279, 309, 310, 376 Jalonke, 213, 215, 373 Jamsay, 278, 368 Janji, 85–88, 91, 93, 94, 356 Jarawa, 275, 336 Jenjo, 156, 365 Jibu, 108, 275, 358 Jiru, 267, 336 Jita, 19, 319, 347 Jomang, 53, 145<sup>17</sup> , 330 Joola, 28, 41–43, 233, 241–245, 249, 250, 250<sup>37</sup> , 251, 251<sup>38</sup> , 280, 309, 310, 376 Jowulu, 140, 213–216, 218–220, 222–228, 306, 307, 373 Jukun, 106, 275, 358 Jula, 220, 374 Jwira, 125, 360 Kaan, 157, 168, 365 Kaansá, 189, 370 Kabiye, 185, 278, 370 Kabwa, 317, 342 Kadara, 284, 357 Kahe, 261, 274, 343 Kaje, 95–98, 100, 357 Kakabe, 220, 374 Kakanda, 102, 324, 358 Kako, 315, 338 Kalanga, 18, 321, 350 Kam, 51, 150, 151, 159–171, 266, 300, 301, 365, 370 Kamara, 188, 370 Kamba, 283, 288, 343 Kambali, 85–88, 91, 93, 94, 356 Kami, 285, 345 Kande, 261, 339 Kantosi, 188, 370

Kanuri, 167 Kanyok, 320, 348 Kapya, 323, 358 Kara, 319, 346 Karaboro, 187, 370 Karang, 287, 366 Karon, 26, 27, 41, 242–245, 376 Kasa, 26, 27, 41, 241–245, 280, 330, 376 Kasanga, 233, 266, 280, 376 Kasem, 187, 371 Katla, 53, 144, 145, 145<sup>17</sup> , 299, 300, 330, 331 Kawara, 202, 371 Kebu, 47, 120, 276, 360 Keeraak, 242–245, 280, 376 Kela, 285, 341 Kele, 274, 339 Kentohe, 42, 246, 247, 279, 376 Kenyang, 62, 323, 325, 336 Kete, 261, 348 Kgalagadi, 18, 321, 350 Khana, 334, 353 Khe, 184, 187, 189, 190, 195, 203–212, 304, 371 Khisa, 52, 187, 330, 371 Khumbi, 64, 350 Kiamba, 278, 371 Kikamba, 317, 343 Kikongo, 22, 286, 318, 346 Kikuyu, 23, 283, 288, 343 Kila, 39, 330, 336 Kim, 146, 155, 159–171, 300–302, 366 Kimbu, 285, 344 Kiong, 275, 353 Kirma, 184, 187, 190, 197, 203–212, 304, 371 Kisanga, 320, 348

Kisi, 229, 230, 369 Kizeela, 320, 348 Kiɔŋ, 333, 353 Klao, 140–143, 299, 362 Koalib, 31, 145<sup>17</sup> Kobiana, 11, 233, 233<sup>34</sup> , 245, 280, 376 Kodia, 140–143, 362 Kohumono, 267, 353 Koke, 154, 276, 366 Kol, 63, 338 Kolbila, 50, 152, 330, 366 Kolum So, 278, 368 Kom, 267, 274, 275, 285, 291, 345, 351 Komo, 69, 342 Komoro, 274, 285, 345 Konkomba, 186, 371 Konni, 184, 371 Kono, 213–216, 218–220, 222–228, 307, 374 Konongo, 317, 344 Konyagi, 28, 29, 234, 237, 376 Konzo, 261, 285, 347 Koongo, 64, 346 Koonzime, 19, 315, 338 Koring, 321, 333, 353 Korop, 275, 321, 333, 353 Kota, 287, 339 Kotafon, 47, 120, 360 Kotopo, 51, 366 Koyo, 261, 341 Kpa, 261, 266, 274, 338 Kpelle, 227, 374 Kplang, 45<sup>5</sup> , 360 Krache, 45<sup>5</sup> , 361 Krahn, 140<sup>16</sup> , 362 Krim, 38, 229, 369 Krobu, 45, 46, 49, 126–137, 361 Krumen, 140<sup>15</sup> , 291, 362

Kugbo, 267, 353 Kukele, 283, 321, 333, 353 Kulaal, 266, 366 Kulaal (Gula), 154 Kulango, 184, 186, 190, 197, 198, 203–212, 267, 304, 305, 371 Kulung, 284, 336 Kumba, 51, 266, 366 Kuranko, 220, 374 Kurumfe, 184, 190–192, 203–212, 304, 371 Kusu, 285, 341 Kuteb, 84, 290, 323, 358 Kutu, 285, 345 Kuwa, 140–143, 299, 362 Kwa, 23, 31, 39, 43, 44, 47–50, 52, 54, 118, 127–137, 199, 211, 255, 271, 272, 276–278, 281–284, 287–289, 291, 292, 297, 298, 302, 312, 331, 359, 366 Kwaatay, 26–28, 241–245, 280, 376 Kwakum, 315, 338 Kwanka, 284, 357 Kwaya, 319, 346 Kyanga, 217, 224, 374 Kélé, 19, 316, 339 Laal, 5, 149, 157–171, 232, 257, 258, 260, 269, 283, 289, 292, 300–302, 312, 313, 363, 376 Laala, 5, 232, 376 Lafofa, 145<sup>17</sup> Laimbue, 14, 351 Lama, 185, 371 Lamba, 283, 371 Landuma, 229, 279, 281, 308, 369 Laro, 145<sup>17</sup> Larteh, 45, 361 Lega, 261, 342

Leggbo, 19, 321, 333, 353 Lele, 220, 220<sup>28</sup> , 316, 374 Lelemi, 46, 48, 121, 284, 361 Lengola, 317, 342 Lenje, 320, 349 Ligbi, 225, 374 Lijili, 21, 22, 95–98, 100, 324, 357 Lika, 317, 342 Likile, 11, 341 Likpe, 121, 361 Limba, 28, 229, 252, 253, 257, 258, 260, 269, 280, 291, 311, 312, 363 Limbum, 32, 69, 351 Lingala, 172–174, 180, 181, 292, 316, 341 Lobala, 316, 341 Lobi, 184, 184<sup>22</sup> , 185, 186, 190, 198, 203–212, 305, 371 Lobi (Lobiri), 198, 371 Logba, 46, 121, 361 Logol, 145<sup>17</sup> Logooli, 319, 347 Lokaa, 333, 353 Lombi, 261, 337 Longto, 147, 148, 366 Longuda, 146, 147, 156, 159–171, 301, 366 Longurama, 146, 366 Looma, 221, 287, 374 Lorhon, 267, 371 Lozi, 261, 266, 350 Lua, 153, 366 Luba-Katanga, 64, 348 Lubwisi, 319, 347 Lufu, 2, 73, 104, 106, 107, 296, 297, 359 Luganda, 261, 271, 347 Luguru, 318, 345

Luhya, 285, 347 Lulamoji, 11, 35, 347 Lumbu, 283, 285, 316, 339 Lumun, 145<sup>17</sup> , 300 Lunda, 319, 348 Lundu, 261, 266, 337 Luyia, 319, 347 Lyele, 187, 283, 371 Lyive, 62, 67, 336 Láá Láá, 187, 371 Mabo, 287, 357 Machame, 317, 343 Mada, 95–100, 108, 357 Makonde, 19, 320, 349 Malila, 64, 277, 320, 349 Mama, 39, 54, 330, 336 Mamara, 189, 371 Mambai, 276, 366 Mambila, 72, 105, 267, 275, 336 Mambwe, 285, 344 Mampruli, 188, 371 Manda, 274, 286, 349 Mandinka, 287, 374 Mangbai, 266, 366 Mani, 229, 230, 369 Manjak, 5, 13, 42, 233<sup>34</sup> , 234, 239, 245, 246, 249, 250, 250<sup>37</sup> , 251, 251<sup>38</sup> , 266, 279–281, 309, 310, 376 Mankanya, 27, 245, 279–281, 310, 376 Mano, 225, 374 Marka Dafing, 183, 374 Masaba, 285, 319, 347 Mashami, 291, 343 Matengo, 286, 320, 349 Matuumbi, 286, 349 Matya Samo, 214, 215, 374 Maxi-Gbe, 120, 361

Maya Samo, 217, 374 Mba, 172, 174–181, 284, 302, 303, 367 Mbala, 64, 348 Mbangwe, 63, 339 Mbanza, 172, 367 Mbato, 46, 124, 287, 361 Mbe, 2, 18, 21, 56, 57, 59–63, 65, 66, 68, 69<sup>6</sup> , 70, 71, 267, 275, 324, 336 Mbelime, 52, 54, 188, 191, 331, 371 Mbembe, 283, 321, 333, 353 Mbere, 63, 261, 266, 340 Mboa, 275, 336 Mbosi, 316, 341 Mbowe, 319, 348 Mbọfīa, 83, 355 Mbugu, 286, 345 Mbugwe, 285, 344 Mbukushu, 319, 348 Mbula-Bwazza, 283, 336 Mbule, 323, 336 Mbum, 146, 153, 155, 156, 159–171, 276, 284, 300, 301, 366 Mbunda, 19, 319, 348 Mbuun, 20, 316, 340 Mbwela, 261, 274, 348 Mbwera, 274, 348 Medumba, 58, 351 Mende, 226<sup>32</sup> , 374 Mengisa, 315, 338 Meru, 291, 317, 343 Miyobe, 30, 185, 187, 371 Mlomp, 41, 242–245, 280, 330, 376 Mmen, 39, 275, 330, 351 Moba, 188, 371 Mochi, 317, 343 Moghamo, 14, 322, 351 Mom Jango, 37, 38, 51, 266, 331, 366

Mombo, 181, 183, 368 Momi, 147, 148, 366 Mongo-Nkundu, 316, 341 Mono, 51, 367 Moore, 188, 212, 371 Moro, 31, 145<sup>17</sup> Morwa, 267, 275, 357 Mosi, 278, 371 Mpiin, 22, 316, 340 Mpoto, 286, 349 Mpumpong, 35, 338 Mpur, 291, 340 Mumuye, 7, 148, 151, 153, 159–171, 300, 301, 366 Mundang, 51, 272, 366 Mundani, 14, 19, 21, 322, 351 Munga, 289, 366 Mungaka, 322, 351 Mushunguli, 318, 345 Mwan, 213, 374 Mwenyi, 274, 348 Mwesa, 316, 339 Myene, 22, 23, 23<sup>4</sup> , 24, 25, 64, 284, 339 Nafaanra, 187, 371 Najamba, 181, 182, 368 Naki, 321, 336 Nalu, 28, 29, 236, 237, 239, 240, 266, 308, 309, 376 Nande, 22, 285, 319, 347 Nata, 317, 343 Nateni, 185, 187, 267, 371 Natioro, 184, 186, 189, 190, 201, 201<sup>23</sup> , 202–212, 304, 305, 371 Nawdm, 52, 184, 188, 193, 194, 260, 330, 371 Nawuri, 45<sup>5</sup> , 361 Nchane, 58, 336

Nchumburu, 45<sup>5</sup> , 361 Ndali, 19, 320, 349 Ndamba, 286, 318, 345 Ndambomo, 287, 339 Nde-Ndele, 21, 322, 336 Ndemli, 2, 57, 59–61, 63, 65, 66, 68, 70, 71, 324, 336 Ndengese, 32, 63, 316, 341 Nding, 53, 145<sup>17</sup> , 330 Ndoe, 23, 283, 288, 336 Ndogo, 52, 54, 330, 367 Ndut, 232, 376 Negeni, 202, 371 Nembe, 139, 284, 361 Neyo, 140–143, 362 Ngangam, 185, 186, 371 Ngbaka, 172, 174–181, 284, 302, 303, 368 Ngbandi, 172, 173, 175–181, 302, 303, 368 Ngemba, 57, 59–61, 63, 65, 66, 68, 70, 71, 351 Ngie, 322, 351 Ngiemboon, 322, 325, 351 Ngindo, 286, 349 Ngomba, 22, 322, 351 Ngombe, 259, 316, 341 Ngoreme, 19, 317, 342 Ngul, 316, 340 Ngulu, 285, 345 Ngumba, 315, 338 Ngungwel, 316, 340 Ngwe, 267, 351 Ngwoi, 90, 356 Niansogoni, 202, 372 Niellim, 153, 154, 169, 284, 366 Nilamba, 21, 317, 344 Nimbari, 147, 148, 151, 153, 284, 366

Ninzo, 4, 95–100, 108, 357 Nkem, 59, 62, 267, 275, 283, 288, 336 Nkem-Nkum, 59, 62, 336 Nki, 333, 353 Nkonya, 45<sup>5</sup> , 361 Nkore-Kiga, 283, 347 Nkoya, 266, 320, 348 Nkumbi, 320, 350 Nomaande, 18, 62, 323, 336 Noon, 232, 376 Notre, 184, 372 Nsong, 14, 316, 340 Ntcham, 31, 186, 372 Ntumbede, 19, 316, 339 Nubaca, 18, 67, 323, 336 Nugunu, 18, 323, 336 Nulibie, 323, 336 Numaala, 18, 323, 336 Nungu, 4, 357 Nuni, 33, 187, 372 Nupe, 102, 358 Nyabwa, 140<sup>16</sup> , 362 Nyakyusa, 320, 349 Nyali, 317, 342 Nyambo, 319, 346 Nyamwanga, 320, 349 Nyamwezi, 285, 288, 344 Nyaneka, 320, 350 Nyangbo, 48, 120, 361 Nyanja, 19, 320, 349 Nyankole, 261, 283, 319, 346 Nyarafolo, 189, 372 Nyaturu, 317, 344 Nyemba, 64, 348 Nyengo, 261, 266, 348 Nyole, 18, 319, 347 Nyore, 22, 319, 347 Nyoro, 283, 319, 346

Nyun, 26–28, 43, 54, 232, 233, 237, 239, 240, 266, 279, 309, 330, 377 Nyun Djibonker, 279, 377 Nyun Gubëeher, 330, 377 Nyun Gujaxer, 279, 377 Nyun Gunyamolo, 27, 377 Nzadi, 266, 316, 340 Nzema, 43, 45, 45<sup>2</sup> , 125, 361 Nɡonɡo, 21, 318, 346 Obolo, 334, 353 Odual, 321, 334, 353 Ogbia, 321, 334, 353 Ogbronuagum, 334, 353 Ogoni, 73–77, 108, 334, 353 Okam, 333, 353 Oko, 2, 73, 104, 106–117, 296, 324, 359 Okobo, 292, 334, 353 Okpamheri, 23, 275, 283, 288, 354 Oloma, 267, 354 Olulumo, 267, 275, 353 Ombo, 285, 341 Orig, 53, 145<sup>17</sup> , 330 Oro, 19, 292, 321, 334, 353 Oroko, 315, 337 Orungu, 266, 284, 339 Paasaal, 187, 372 Palaka, 189, 372 Palor, 27, 43, 232, 330, 377 Palɛn, 201, 202, 372 Pam, 51, 366 Pambia, 52, 54, 330, 368 Pangwa, 285, 318, 345 Paɡibete, 21, 316, 341 PB, ix, 5, 58, 60, 63, 257, 261, 273, 285 Peere, 50, 54, 147, 330, 366 Pemba, 261, 345

Pepel, 28, 42, 245, 246, 252, 279, 282, 310, 330, 377 Pere, 37, 38, 289, 366 Perge Tegu, 182, 368 Phende, 319, 348 Phuie, 185, 187, 372 Pimbwe, 285, 344 Pinji, 14, 339 Piti, 90, 356 PLC, ix, 106 Pogoro, 274, 286, 345 Pokomo, 261, 285, 317, 343 Pongu, 85–88, 91, 93, 94, 356 PP, ix, 106 Proto-Adamawa, 148, 149, 158, 162, 166, 167, 170, 259, 290, 331 Proto-Agneby, 122 Proto-Atlantic, 6, 8, 9, 12, 26, 43, 106, 237, 280, 308, 309 Proto-Bak, 8–10, 248, 252, 308, 310 Proto-Bak-Atlantic, 252 Proto-Balant, 8 Proto-Balto-Slavic, 7 Proto-Bantoid, 1, 60, 62, 64, 67, 67<sup>5</sup> , 69, 72, 73, 112, 114–117 Proto-Bantu, ix, 5, 14<sup>2</sup> , 20, 23, 26, 44, 56–58, 64, 67<sup>5</sup> , 109, 130, 257, 259, 272, 273, 277, 286, 313 Proto-Benue-Congo, 2, 58, 104, 111, 114, 117, 118 Proto-Bia, 125 Proto-Bijogo, 8 Proto-Cangin, 8, 43, 231, 232, 331 Proto-Cross, 76, 77 Proto-Dogon, 53, 331 Proto-Duru, 148 Proto-Eastern Bantoid, 105 Proto-Eastern Grassfields, 58

Proto-Eastern Mande, 280 Proto-Eastern-Benue-Congo, 1 Proto-Edoid, 82, 275 Proto-Fula-Sereer, 8, 235, 309 Proto-Gbaya, 51, 219, 280, 291 Proto-Gbe, 119 Proto-Grusi, 197, 203–212, 305 Proto-Gur, 203, 207, 210, 305, 331 Proto-Ikaan, 103 Proto-Jaad-Biafada, 8, 10, 238, 239 Proto-Joola, 8, 251, 280, 310 Proto-Joola-Bayot, 8 Proto-Jukunoid, 5, 84, 108, 275 Proto-Ka-Togo, 120, 297 Proto-Kainji, 87, 89, 90, 92 Proto-Kim, 147 Proto-Kordofanian, 144, 289 Proto-Kru, 139–142 Proto-Kwa, 118, 126, 128, 129, 131–133, 135, 137, 297 Proto-Leko-Nimbari, 148 Proto-Longuda, 147 Proto-Lower Cross, ix, 74, 106 Proto-Mande, 215, 219, 221, 222, 225, 226, 228, 306 Proto-Manjak-Mankanya-Pepel, 8 Proto-Mel, 231, 308 Proto-Mumue-Yandang, 153 Proto-Na-Togo, 46, 121 Proto-Nalu-Baga Fore-Baga Mboteni, 8 Proto-NC, 5, 9, 81, 139, 238, 252, 267, 269, 281, 286, 291, 296–299, 304, 306, 308, 310–313 Proto-Niger-Congo, 6, 8, 9, 13, 23, 28, 31, 35, 254, 288, 293, 296, 299 Proto-Northern Atlantic, 8, 9, 238

Proto-Northern Mel, 230 Proto-Nyo, 132, 297 Proto-Oti-Volta, 194, 203–212, 260, 305 Proto-Platoid, ix, 95–97, 99, 101, 106, 108 Proto-Potou-Akanic-Bantu, 270 Proto-Potou-Tano, ix, 44, 270, 331 Proto-Potou-Tano-Congo, 270 Proto-South-Eastern Mande, 291 Proto-South-Mel, 230 Proto-Tenda, 8, 10, 234 Proto-Ubangi, 175, 177, 179–181, 303 Proto-Upper Cross, ix, 74, 106, 108 Proto-Waja, 157 Proto-Western Mande, 222, 223, 225 Proto-Western-BC, 297 Proto-Western-Benue-Congo, 1 Proto-Wolof, 8 Proto-Yoruba-Igala, 78, 79 PTB, ix, 44 PUC, ix, 105, 106 Punu, 274, 283, 285, 316, 339 Pyem, 106, 357 Rangi, 317, 344 Rere, 145<sup>17</sup> , 331 Reshe, 85–88, 91, 93, 94, 356 Rijau, 85–88, 91, 93, 94, 357 Ring, 57–61, 63, 65, 66, 68, 70, 71, 351, 359–361, 371 Rombo, 272, 343 Ronga, 274, 350 Rukuba, 95–100, 108, 358 Rundi, 22, 266, 319, 347 Rungu, 285, 349 Russian, 7 Rwa, 317, 343 Rwanda, 274, 319, 347

Rwila, 317, 344 Safaliba, 52, 188, 372 Safin, 232, 377 Sakata, 20, 21, 261, 266, 316, 341 Sake, 22, 316, 339 Samba Leko, 50, 152, 366 Sambe, 95–100, 358 Samo, 213–215, 331, 374 San, 40, 213, 374 Sango, 172, 174, 175, 180, 181, 283, 368 Sangu, 316, 339 Sapo, 140<sup>16</sup> , 362 Saxwe, 47, 361 SE, ix, 214–216, 218–223, 225–228, 273, 277, 307 Seenku, 49, 222, 374 Sefwi, 125, 361 Sekpele, 46, 361 Selee, 46, 361 Seme, 140–143, 299, 362 Senari, 267, 372 Sengele, 32, 63, 316, 341 Sere, 51, 52, 172, 174–181, 284, 303, 368 Sereer, 12, 43, 234, 235, 237, 239, 240, 266, 280, 308, 309, 330, 377 Sesotho, 274, 350 Shambala, 285, 318, 345 Shanga, 217, 374 Shempire, 186, 189, 372 Sherbro, 229, 266, 369 Shi, 25, 62, 64, 319, 347 Shirumba, 145<sup>17</sup> Simbiti, 21, 317, 347 Sira, 261, 283, 285, 316, 339 Sisaala, 185, 187, 278, 372 Siwu, 46, 48, 361 Sizaki, 317, 342

So, 10, 57, 58<sup>3</sup> , 229, 261, 266, 280, 338 Soga, 25, 34, 35, 283, 319, 347 Songo, 22, 316, 340 Songye, 320, 348 Soninke, 40, 41, 53, 54, 213–216, 218–220, 222, 223, 225–228, 307, 330, 374 Sourani, 202, 372 Sua, 27, 43, 229, 252, 253, 257, 258, 260, 266, 269, 280, 292, 310, 312, 330, 363 Suba, 317, 342 Subiya, 319, 348 Suga, 69, 336 Sukuma, 261, 285, 288, 317, 344 Sumbwa, 285, 344 Supyire, 188, 372 Surubu, 90, 357 Susu, 213–216, 218–220, 222–229, 236, 307, 374 Swahili, 274, 285, 345 Swazi, 261, 350 SWM, ix, 214–216, 218–220, 222–228, 307 Syer, 199, 372 Sìcìté, 188, 372 Tagbu, 52, 368 Tagoi, 4, 145<sup>17</sup> Tagwana, 189, 372 Taita, 274, 343 Tajuasohn, 140–143, 362 Talodi, 53, 144, 145, 145<sup>17</sup> , 299, 300, 330 Tampulma, 187, 284, 372 Tanda, 234, 266, 377 Taram, 51, 366 Tarok, 95–98, 100, 109, 358

Teen, 184, 186, 190, 200, 203–212, 287, 304, 305, 372 Tegali, 145<sup>17</sup> Tegem, 53, 145<sup>17</sup> , 330 Teke-Nzikou, 316, 340 Teke-Tege, 63, 316, 340 Teke-Tyee, 22, 316, 340 Tem, 278, 372 Tembo, 22, 32–34, 319, 347 Teme, 51, 366, 368, 369 Temne, 29, 229, 266, 279–281, 369 Temne tɔ-f-ʌt, 308 Tene Kan, 287, 368 Tenyer, 199, 372 Tesu, 95–100, 358 Tetela, 15–17, 22, 316, 341 Tiba, 2, 57, 59–61, 63, 65, 66, 68, 72, 105, 106, 336 Tiefo, 184, 187, 190, 200, 203–212, 305, 372 Tiene, 32, 63, 316, 340 Tikar, 2, 56<sup>2</sup> , 57, 59–61, 63, 65, 66, 68, 70, 71, 337 Tikuu, 261, 274, 285, 345 Tima, 53, 145<sup>17</sup> , 331 Timba, 202, 372 Tira, 145<sup>17</sup> Tiv, 39, 62, 67, 275, 324, 330, 337 Tocho, 53, 145<sup>17</sup> , 330 Tommo So, 53, 182, 278, 368 Toro So, 183, 278, 369 Toussian, 189, 372 Tsishingini, 85–89, 91, 93, 94, 357 Tubeta, 285, 288, 343 Tuki, 18, 323, 337 Tula, 50, 157, 290, 366 Tumbuka, 19, 320, 349 Tunen, 62, 274, 323, 337

Tunya, 37, 154, 168, 266, 366 Tuotomb, 18, 323, 337 Tupuri, 272, 366 Tura, 222, 224, 374 Turka, 187, 372 Tusia, 184, 189, 190, 200, 203–212, 305, 372 Tuwuli, 48, 276, 361 Twendi, 72, 105, 337 Twi, 44, 125, 287, 361 Tyap, 18, 95–98, 100, 324, 358 Tyurama, 184, 187, 190, 197, 203–212, 372 Uda, 292, 321, 334, 353 Ufia, 267, 353 Ukue, 267, 275, 354 Ukwa, 334, 353 Umbundu, 320, 350 Urhobo, 283, 322, 354 Usakade, 19, 321, 334, 353 Ut-Ma'in, 85–88, 91, 93, 94, 357 Utoro, 145<sup>17</sup> Utɔnkɔn, 321, 333, 353 Vagla, 188, 189, 372 Vai, 213–216, 218–228, 287, 307, 374 Venda, 261, 266, 350 Vere, 37, 51, 330, 367 Viemo, 184, 187, 190, 201, 203–212, 267, 282, 305, 372 Vinza, 23, 285, 347 Vove, 285, 339 Vunjo, 317, 343 Vute, 290, 337 Vɔmnəm, 147, 366 Waama, 188, 191, 372 Waci-Gbe, 119, 361

Waja, 146, 156, 157, 159–171, 272, 284, 300–302, 367 Waka, 51, 367 Wali, 188, 372 Wan, 213, 374 Wané, 140–143, 362 Wapan, 275, 358 Wara, 184, 186, 189, 190, 201, 201<sup>23</sup> , 202–212, 304, 305, 373 Warnang, 145<sup>17</sup> , 284, 300 Winyé, 185, 187, 373 Wobe, 140<sup>16</sup> , 362 Wolof, 5, 10, 12, 37, 43, 231, 235–240, 245, 287, 308–310, 331, 377 Wom, 51, 367 Wumbvu, 316, 339 Xhosa, 20, 21, 321, 350 Xwla, 47, 361 Yaka, 64, 286, 341 Yakoma, 173, 368 Yala, 322, 355 Yambeta, 18, 323, 337 Yanda Dom, 182, 369 Yangben, 18, 323, 337 Yansi, 261, 266, 291, 340 Yao, 320, 349 Yaure, 213, 374 Yemba, 322, 325, 351 Yendang, 50, 51, 148, 266, 284, 331, 367 Yeskwa, 22, 95–100, 324, 358 Yeyi, 62, 350 Yingilum, 150, 160–171, 367 Yom, 184, 188, 193, 194, 260, 373 Yombe, 286, 346 Yorno So, 182, 183, 369 Yoruba, 78–81, 106, 322, 354

Yukuben, 84, 290, 358 Yungur, 157–171, 282, 284, 300–302, 367

Zan Gula, 154, 367 Zanaki, 285, 342 Zande, 51, 172, 175–181, 302, 303, 368 Zigula, 285, 288, 345 Zimba, 317, 342

## The numeral system of Proto-Niger-Congo

This book proposes a reconstruction of the Proto-Niger-Congo numeral system. The emphasis is on providing an exhaustive account of the distribution of forms by families, groups, and branches. The large databases used for this purpose make it possible to work both with the distribution of attested words and with the distribution of gaps in postulated cognates. The distribution of filled cells and gaps is a useful tool for reconstruction.

Following an introduction in the first chapter, the second chapter of this book is devoted to the study of various uses of noun class markers in numeral terms. The third chapter deals with alignment by analogy in numeral systems. Chapter 4 offers a step-by-step reconstruction of the number systems of the proto-languages underlying each of the twelve major NC families, on the basis of the step-by-step reconstruction of numerals within each family. Chapter 5 deals with the reconstruction of the Proto-Niger-Congo numeral system on the basis of the step-by-step reconstructions offered in Chapter 4. Chapter 6 traces the history of the numerals of Proto-Niger-Congo, reconstructed in Chapter 5, in each individual family of languages.